MAPPING THE DARK MATTER FROM UV LIGHT AT HIGH REDSHIFT:
AN EMPIRICAL APPROACH TO UNDERSTAND GALAXY STATISTICS 
ABSTRACT

We present a simple formalism to interpret the observations of two galaxy statistics, the UV lu- 
minosity function (LF) and two-point correlation functions for star-forming galaxies at z~4, 5 and 6 
in the context of ΛCDM cosmology. Both statistics are the result of how star formation takes place
in dark matter halos, and thus are used to constrain how UV light depends on halo properties, in 
particular halo mass. The two physical quantities we explore are the star formation duty cycle, and 
the range of UV luminosity that a halo of mass M can have (mean and variance). The former di- 
rectly addresses the typical duration of star formation activity in halos while the latter addresses the 
averaged star formation history and regularity of gas inflow into these systems. In the context of this 
formalism, we explore various physical models consistent with all the available observational data, and 
find the following: 1) the typical duration of star formation observed in the data is < 0.4 Gyr (1σ),
2) the inferred scaling law between the observed $L_{\rm UV}$ and halo mass M from the observed faint-end
slope of the luminosity functions is roughly linear out to $M \approx 10^{11.5}$-$10^{12}\,h^{-1}M_\odot$ at all redshifts
probed in this work, and 3) the observed $L_{\rm UV}$ for a fixed halo mass M decreases with time, implying
that the star formation efficiency (after dust extinction) is higher at earlier times. We explore several
different physical scenarios relating star formation to halo mass, but find that these scenarios are 
indistinguishable due to the limited range of halo mass probed by our data. In order to discriminate 
between different scenarios, we discuss the possibility of using the bright-faint galaxy cross-correlation 
functions and more robust determination of luminosity-dependent galaxy bias for future surveys. 
Subject headings: cosmology: theory — dark matter — galaxies: halos — galaxies: formation — 
large-scale structure of universe 



1. INTRODUCTION 

In the last decade, substantial progress has been made
in advancing our understanding of galaxy clustering in
connection to dark matter halo clustering. Numerous
surveys conducted out to z ~ 6 ($t_{\rm universe}$ ~ 0.9 Gyr)
have selected large samples to measure the clustering as
a function of galaxy properties such as color, luminosity,
spectral type, and morphology (Norberg et al. 2001;
Zehavi et al. 2002, 2005; Giavalisco & Dickinson 2001;
Foucaud et al. 2003; Adelberger et al. 2005; Allen et al. 2005;
Lee et al. 2006; Kashikawa et al. 2006;
Coil et al. 2006b, 2008; Yoshida et al. 2008). These
results have convincingly shown that the clustering
strength of galaxies has a strong dependence on their
physical properties. In general, more luminous (in the
optical or UV) or redder galaxies are more strongly
clustered in space than less luminous or bluer ones.
The observed trends of galaxy clustering are similar to
those of halos. The hierarchical theory of structure formation
predicts that the clustering of halos is a strong function
of their masses and assembly history (Mo & White
1996; Gao, Springel, & White 2005; Gao & White 2007;
Wechsler, Zentner, Bullock, Kravtsov, & Allgood 2006).
Because galaxies form inside dark matter halos, as
baryonic matter is pulled into the gravitational potential
wells of halos, cools, and initiates star formation
(White & Rees 1978), the astrophysical processes
of galaxy formation are invariably linked to the characteristics
of dark matter halos. The main halo properties
include their sizes, masses, angular momentum, assembly
history, and internal distribution (Navarro et al.
1997; Moore et al. 1998).

Recent evidence has further corroborated the halo-galaxy
connection. Zehavi et al. (2004) have measured
the galaxy two-point correlation function (CF) of the
Sloan Digital Sky Survey (SDSS) galaxies with unprecedented
precision. From these measures, they have
detected a small feature in the shape of the correlation
function at a physical scale of ≈1 Mpc. The observed
scale of the bump coincides with the physical scale where
the transition from the one-halo term (r < 1 Mpc) to the
two-halo term (r > 1 Mpc) occurs. The former arises
from the spatial correlation between the parent halo and
its substructure (subhalos) and between subhalos, while
the latter arises from the correlation between distinct halos.
Soon after, similar transitions were found at larger
look-back times for galaxies selected in the rest-frame optical
at z ~ 1 (Coil et al. 2006a) and z ~ 2.5 (Quadri et al.
2008), and in the UV at z ~ 3 (Hildebrandt et al. 2007) and
z ~ 4 and 5 (Ouchi et al. 2005; Lee et al. 2006). The
physical scale of the transition is observed to increase
with time (decreasing redshift), which is expected because
halos grow in size.

The two independent lines of evidence, the observed lu- 
minosity/color dependence of galaxy clustering and the 
detection of a transition scale in the CFs, lend sup- 
port to the close connection between halos and galax- 
ies. A logical next step is to constrain the scaling law 
of the two properties, namely, galaxy luminosity L and 
halo mass M, in order to obtain a more detailed pic- 
ture of the physical processes. Furthermore, by compari- 
son of the scaling laws at different cosmic epochs, one 
can begin to understand the time sequence of galaxy 
formation in the context of the halo evolution (e.g.,
Zheng et al. 2007; White et al. 2007; Conroy et al. 2007;
Conroy & Wechsler 2008).



Many authors have successfully modeled such scaling
relations for local galaxies based on surveys
such as the SDSS (Yang, Mo, & van den Bosch 2003;
van den Bosch, Mo, & Yang 2003a). They have used
joint constraints of the observed luminosity function (LF)
and the clustering measures for the same galaxies. Using
the 2MASS data, Vale & Ostriker (2006) modeled
the scaling relation by directly mapping the shape
of the halo mass function (the number density of halos
as a function of halo mass; Press & Schechter 1974;
Sheth & Tormen 1999; Sheth et al. 2001) to the galaxy
LF (the number density of galaxies as a function of luminosity),
assuming that there is a unique one-to-one relation
between the halo mass and galaxy light. A similar
abundance-matching method was used to constrain the
relation of stellar masses to halo masses for galaxies at
intermediate redshift (Conroy & Wechsler 2008). These
models assume that every halo above a mass threshold
(given by the integrated LF constraint) hosts a visible
galaxy. The assumption is a reasonable
one in the local universe, because the wavelength ranges
probed by these surveys trace the general stellar population,
in other words, the star formation history integrated
over the galaxy's entire lifetime, and thus are
insensitive to the details of a galaxy's recent star formation
history. Hence, the halo mass, as a robust indicator
of the local density contrast, is well correlated with
the stellar mass of the galaxy therein.

High-redshift galaxy samples, however, are often selected
in the rest-UV, which traces the instantaneous
formation of massive stars (e.g., Madau et al. 1996).
Moreover, the intrinsic UV luminosity is obscured and
reddened by dust in the interstellar medium, which adds
an additional uncertainty to the halo-galaxy association
(Conroy et al. 2008). Hence, at high redshift,
the modeling of such a relation requires extra caution,
as there is no reason to believe that every halo
hosts a currently star-forming galaxy. The advantage,
however, is that the same uncertainty that we
face in modeling this connection will give us important
clues to understanding various star formation processes,
such as their typical duration and the dependence
of star formation rate on the host halo mass.
There are reasonable prospects of achieving such a goal,
as the observed clustering properties indicate that the
observed UV luminosity (after dust extinction) still
correlates strongly with clustering strength (and
thus, with halo mass; e.g., Giavalisco & Dickinson 2001;
Adelberger et al. 2005; Allen et al. 2005; Ouchi et al.
2004b, 2005; Lee et al. 2006; Overzier et al. 2006;
Kashikawa et al. 2006; Yoshida et al. 2008). By using
these constraints, combined with the UV LF
measured at the same cosmic epochs (Gabasch et al.
2004; Ouchi et al. 2004a; Sawicki & Thompson 2006;
Yoshida et al. 2006; Bouwens et al. 2007; Iwata et al.
2007; Reddy et al. 2008; McLure et al. 2008), we can
constrain the typical duration of star formation as well as
the physical scaling law between galaxy UV luminosity
and halo mass.

Our effort is motivated by the dramatic improvement
in our understanding of dark matter substructures
from high-resolution dark matter (DM) simulations
and analytic calculations (Kravtsov et al. 2004;
Gao et al. 2004; De Lucia et al. 2004; Zentner et al.
2005; Wechsler et al. 2006), made more solid given the
recent tight constraints on cosmological parameters from
WMAP and other recent studies (Spergel et al.
2003, 2007; Komatsu et al. 2008). Kravtsov et al. (2004)
and Conroy, Wechsler, & Kravtsov (2006) have demonstrated
that halos and subhalos identified in DM simulations
provide an excellent match to the observed correlation
functions at all scales (θ > 2-3″ at z ~ 4,
for example). The tidal stripping and mass loss of
small halos as they enter the potential
well of a larger halo are better understood with high-resolution
simulations (Gao et al. 2004; De Lucia et al.
2004). These dynamical processes may play an important
role in shaping the observed galaxy statistics.
The strong dynamical evolution experienced by subhalos
may not be felt as strongly by the embedded galaxies,
because they are more tightly bound gravitationally at the core
of the system (e.g., Moore et al. 1996;
Klypin et al. 1999; Hayashi et al. 2003; Kravtsov et al.
2004; Nagai & Kravtsov 2005).

In this paper, we attempt to take advantage of
the recent progress seen in both the observed galaxy
statistics, namely, the UV LF and correlation functions
of high-redshift galaxies, and the analogous quantities
for the DM halos, to understand their statistical
association, and thus, the star formation physics
of high-redshift galaxies in relation to their local environments.
Our approach is similar in character
to halo occupation distribution (HOD) models, which
assume that all galaxies are harbored in DM halos
(e.g., Berlind & Weinberg 2002; Bullock et al. 2002;
Zheng et al. 2005), but is generalized to accommodate
galaxy luminosity as a joint variable, similar to the conditional
LF formalism (Yang, Mo, & van den Bosch 2003;
van den Bosch, Mo, & Yang 2003a). We discuss the details
of how our approach differs from the standard
HOD formalism in the next section. We also note that
our methodology is complementary to ab initio calculations
of semi-analytic models or hydrodynamic simulations,
for which many detailed physical processes need
to be modeled to produce the observable properties of
galaxies (e.g., Kauffmann et al. 1993; Cole et al. 2000;
Somerville et al. 2001; Wechsler et al. 2001; Bower et al.
2006; Croton et al. 2006; De Lucia & Blaizot 2007;
Nagamine et al. 2007). Furthermore, our empirical approach
will help provide insight into the physical recipes
implemented in these simulations.
The main goals of this paper are 1) to build a realistic 
and empirical model well suited to analyze high-redshift
data, 2) to interpret the observed galaxy statistics simultaneously
in the light of the properties of ΛCDM halos,
and 3) to draw general conclusions about the physics of 
star formation when the universe was less than 2 billion 
years old. We will take advantage of the newly avail- 
able observational measures made for the high-redshift 
star-forming galaxies. 

This paper is organized as follows. In Section 2, we
describe the methodology, providing the detailed calculations
for the model predictions of the galaxy LF and
correlation functions. Readers who are not interested in
the details of the methodology may skip to Section 3,
where we describe the data sets used for the measurements,
and Section 4, where we present the improved measures of the
auto-correlation functions as well as the LF and cross-correlation
function. In Section 5 we report our main results,
and in Section 6 we discuss the results and their physical
implications. Finally, in Section 7, we present
the model predictions for the galaxy cross-correlation
function, which may help overcome the current limitations
of the data to discriminate different physical scenarios
of star formation. All magnitudes in this work
are in the AB scale (Oke & Gunn 1983). We use a cosmology
with $\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$, $\sigma_8 = 0.9$, $\Gamma = 0.21$,
$H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$ with $h = 0.7$, and the baryonic
density $\Omega_b = 0.04$.
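
As a quick consistency check on the epochs quoted in the Introduction, the sketch below evaluates the age of the universe for the adopted cosmology. It assumes a flat universe containing only matter and a cosmological constant (radiation is neglected), and the function name `age_gyr` is ours, introduced only for illustration.

```python
import math

# Cosmological parameters quoted in the text (flat LambdaCDM).
OMEGA_M = 0.3
OMEGA_L = 0.7
H_LITTLE = 0.7

# 1/H0 in Gyr: 977.8 Gyr corresponds to H0 = 1 km/s/Mpc.
HUBBLE_TIME_GYR = 977.8 / (100.0 * H_LITTLE)

def age_gyr(z):
    """Age of a flat matter + Lambda universe at redshift z, in Gyr."""
    x = math.sqrt(OMEGA_L / OMEGA_M) * (1.0 + z) ** -1.5
    return HUBBLE_TIME_GYR * 2.0 / (3.0 * math.sqrt(OMEGA_L)) * math.asinh(x)

for z in (4.0, 5.0, 6.0):
    print(f"z = {z:.0f}: t = {age_gyr(z):.2f} Gyr")
```

Under these assumptions the age at z ~ 6 comes out close to 0.9 Gyr, and the whole redshift range considered here lies within the first 2 Gyr.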

2. THE FORMALISM 

Here, we present a simple methodology to compute
three galaxy statistics, namely the galaxy LF and the
auto- and cross-correlation functions, directly from the
predicted dark matter halo properties. We assume that
all galaxies reside in halos or subhalos, and that there
exists a broad correlation between the halo masses and
galaxy luminosity characterized by two scaling laws, the
mean and the variance of the observed galaxy luminosity
as a function of halo (or subhalo) mass. We denote the
mean scaling law as $\mathcal{L}(M)$ and the variance as $\sigma_L^2(M)$
hereafter.

The correlation between the UV light and halo mass
is expected from the observed trend that the clustering
strength of galaxies at high redshift increases with their
UV luminosity, just as the clustering strength of halos
increases with mass. Hence, the mean scaling law is assumed to be
such that the UV luminosity of a galaxy is an increasing
function of the mass of its host halo.

Variance in the luminosity at fixed mass can be ex- 
pected on the grounds of several physical effects. First, 
the UV light emitted by galaxies is obscured by dust in 
the interstellar medium in the random geometry along 
the line of sight. Even if the star formation rate (or the 
intrinsic UV luminosity) depends only on the halo mass, 
dust obscuration would result in a spread in the observed 
luminosity around the mean (the intrinsic UV luminos- 
ity modulo mean dust obscuration). In addition, halos 
of similar masses can have a range of large-scale envi- 
ronments, merger histories, and central concentrations, 
resulting in different rates of gas accretion and star formation
in the galaxies. In this paper we primarily focus
on variation due to the nature of "typical" star formation
occurring at high redshift. We assume that star formation
turns on in a halo at a given point in time, continues
for a finite time characterized by $t_{\rm SF}$, and then
ceases. The galaxy in the halo thus brightens in the
UV when star formation starts, and subsequently fades
below the UV detection limit. A simple case where every
halo above a given mass threshold hosts a detectable
galaxy can be incorporated into the general model by setting
$t_{\rm SF} \gg \Delta t_{\rm survey}$, where $\Delta t_{\rm survey}$ is the cosmic time
span covered by a given survey. This case corresponds
to the scenario where star formation turns on at a
time much earlier than the observed epoch and does not
turn off until much later than the observations. The luminosity
variance in this case would instead correspond
to varying degrees of dust obscuration in these galaxies.

On the other hand, the variance of the L-M scaling
relation can arise for another reason if the duration of
star formation is comparable to or shorter than the time
span of the observations ($t_{\rm SF} \lesssim \Delta t_{\rm survey}$). As the onset
of star formation occurs at different times for different
halos (of similar masses), the UV luminosity averaged
over an ensemble of halos of the same mass will have a
range of values determined by the typical duration of star
formation as well as how fast these galaxies "brighten"
and "fade" with time. Another consequence of the finite
duration of star formation is that it changes the manner
in which galaxies and halos are associated with each
other. If the typical star formation duration $t_{\rm SF}$ is much
shorter than $\Delta t_{\rm survey}$, and the SF in each halo turns on
at a random point in time, some halos may not host a
detectable galaxy during the observations. Hence, the
SF duration $t_{\rm SF}$ with respect to the survey time span
$\Delta t_{\rm survey}$ is related to the ratio of galaxy to halo number
density, $n_g/n_h$. Throughout this paper, we refer to this
ratio as the star formation "duty cycle," as it is closely
related to the SF duration.
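
The link between a finite star formation duration and the fraction of halos hosting a detectable galaxy can be illustrated with a toy Monte Carlo. This is only a sketch under strong assumptions that are ours, not part of the formalism: each halo switches on at a time drawn uniformly within a window of length `dt_window`, forms stars for `t_sf`, and is observed at the end of the window. The function name `detected_fraction` is hypothetical.

```python
import random

def detected_fraction(t_sf, dt_window, n_halos=200_000, seed=1):
    """Fraction of halos still forming stars when observed.

    Toy assumptions: onset times are uniform in [0, dt_window], star
    formation lasts t_sf, and all halos are observed at t = dt_window.
    """
    rng = random.Random(seed)
    n_on = 0
    for _ in range(n_halos):
        t_on = rng.uniform(0.0, dt_window)
        # Visible only if star formation has not yet ceased at t = dt_window.
        if t_on + t_sf > dt_window:
            n_on += 1
    return n_on / n_halos

print(detected_fraction(0.25, 1.0))  # short bursts: only a fraction is visible
print(detected_fraction(2.0, 1.0))   # long-lived star formation: all visible
```

In this toy picture the visible fraction approaches $t_{\rm SF}/\Delta t$ for short durations and saturates at unity once the duration exceeds the window, which is the qualitative behavior the duty cycle is meant to capture.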

Compared to the typical application of HOD models
of galaxy clustering, our methodology can specifically
encompass the physical parameters relevant for
high-redshift galaxies. In most implementations to date,
these models have assumed 1) a sharp halo mass cutoff
corresponding to a luminosity threshold for the given
galaxy sample, and 2) that every halo above a given
mass threshold hosts a visible galaxy observed in the
sample (Berlind & Weinberg 2002; Hamana et al. 2004;
Zehavi et al. 2005; Phleps et al. 2006; Lee et al. 2006).
The former is not always assumed (e.g., Tinker et al.
2005); however, determining the smoothness of the mass
cutoff requires a priori physical knowledge as to how mass
and luminosity are related to each other, precisely the
knowledge we need to constrain. Because selection effects
and the physics of star formation are not well understood,
the application of this type of simple HOD model
may not yield realistic physical parameters (Lee et al.
2006). Our methodology also provides a mechanism for
including the galaxy luminosity explicitly as an additional
constraint in the model in conjunction with the
spatial clustering. In a typical HOD framework, this is
done only in a cumulative sense (however, different luminosity
samples can be used to characterize how the
HOD changes with luminosity; e.g., Zehavi et al. 2005;
Coil et al. 2006b; Lee et al. 2006), and not on an individual
galaxy-to-halo basis. Such knowledge
is crucially needed to understand the galaxy LF in the
context of the CDM halos.

We build an empirical model to link galaxy luminosity
and halo/subhalo mass, which in turn can be used
to calculate the observable measures, namely the galaxy
LF and two-point correlation functions. The goal of this
exercise is to find a range of models that satisfy all the
observed measures simultaneously, and thereby to
constrain the physically meaningful L-M relation
(the mean and variance) and star formation duty
cycle at high redshift. In what follows, we describe our
formalism step by step.

2.1. The $L_{\rm UV}$-M relation

We model the probability density for a halo/subhalo of
mass M to host a galaxy observed with a UV luminosity
(denoted at times as $L$, $L_{1700}$, or $M_{1700}$) as a normal
distribution:

$$dP(L|M) = \frac{\mathcal{DC}}{\sqrt{2\pi}\,\sigma_L(M)}\, e^{-(L-\mathcal{L}(M))^2/2\sigma_L^2(M)} \qquad (1)$$



where $\mathcal{L}(M)$ is the average luminosity, at which the probability
density reaches its maximum, and $\sigma_L^2(M)$ is the variance
of the luminosity scatter at a fixed mass M, $\langle (L - \mathcal{L}(M))^2 \rangle$.
Note that our approach differs from, for example,
Giavalisco & Dickinson (2001) and Tasitsiomi et al.
(2004), who adopted a lognormal probability density
function. We discuss the differences in further detail in
Section 5. The parameter $\mathcal{DC}$ represents a typical duty
cycle of the halos ($0 < \mathcal{DC} \le 1$)*. Finally, we define
a total halo occupation efficiency that combines the two
effects, denoted as $\epsilon(M)$ hereafter:

$$\epsilon(M) = \int_{L_0}^{\infty} dP(L|M)\, dL \qquad (2)$$



where $L_0$ is the luminosity threshold which defines a
given galaxy sample. If the star formation duration is
very long (in our definition, $\mathcal{DC} = 1$) and thus every
halo above the mass threshold hosts a galaxy, then the
total probability is unity.
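
Because Equation (1) is a Gaussian in L, the integral in Equation (2) has a closed form in terms of the complementary error function. The following is a minimal sketch of both expressions; the function names and the scalar arguments `mean_L` and `sigma_L` (standing for $\mathcal{L}(M)$ and $\sigma_L(M)$ evaluated at a given mass) are our own illustrative choices.

```python
import math

def dP_dL(L, mean_L, sigma_L, duty_cycle=1.0):
    """Equation (1): probability density for a halo to host a galaxy of luminosity L."""
    z = (L - mean_L) / sigma_L
    return duty_cycle * math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma_L)

def occupation_efficiency(L0, mean_L, sigma_L, duty_cycle=1.0):
    """Equation (2): integral of Equation (1) over L > L0, in closed form."""
    arg = (L0 - mean_L) / (math.sqrt(2.0) * sigma_L)
    return 0.5 * duty_cycle * math.erfc(arg)

# A halo whose mean luminosity sits exactly at the threshold is detected
# half of the time it is "on"; far above the threshold the efficiency
# saturates at the duty cycle.
print(occupation_efficiency(1.0, 1.0, 0.3, duty_cycle=0.6))
print(occupation_efficiency(1.0, 10.0, 0.3, duty_cycle=0.6))
```

The closed form avoids a numerical integral at every mass, which is convenient when $\epsilon(M)$ is evaluated repeatedly inside the mass integrals that follow.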

Note that, by construction, our model distinguishes 
two separate components that affect the halo-galaxy as- 
sociation: (1) the typical duration of star formation in 
the halos with respect to the time span covered by the 
survey, and (2) the scatter in the L-M relation due to 
the stochastic nature of the star formation (and dust 
obscuration), as qualitatively discussed in the previous 
section. While the formalism may likely be a simplified 
version of the reality, it offers a reasonable representa- 
tion of the two important elements, the duty cycle and 
scatter, which can then be constrained observationally. 
Our method is intended to minimize the introduction 
of more free parameters, and at the same time, provide 
an inclusive description of all modes of star formation. 
For example, halos with quiescent star formation may be
close to the mean $\mathcal{L}(M)$, while halos undergoing bursty
star formation or little star formation would fall in
either tail of the distribution (see later for more discussion).

2.2. The Total Halo Mass Function and Mass Loss of
Subhalo Populations

* In reality, it is possible that the duty cycle may vary as a
function of halo mass. However, we chose not to model $\mathcal{DC}(M)$ as
it is unlikely to be constrained at least based on the current data.



Throughout this work, we assume that the galaxy
UV luminosity is correlated with the pre-stripped halo
mass, rather than the current one. This is relevant for
"subhalos," which can be stripped of a substantial fraction
of the mass they had prior to being accreted into
a larger system (the "parent halo") via tidal stripping
and other dynamical processes. A similar assumption
has been made by previous authors (e.g., Conroy et al.
2006; Berrier et al. 2006). Locally, such an assumption
is rooted in the fact that galaxies are situated at
the center of halos, and thus are more resilient to stripping
than their host halos, provided that galaxies have assembled
their stellar populations prior to this event (e.g.,
Hayashi et al. 2003; Nagai & Kravtsov 2005). On the
other hand, for galaxies at high redshift observed in the
rest-UV, the same assumption may not apply, given that
the instantaneous star formation, and not the light from
the general stellar population, is traced. However, it is
encouraging that when the same assumption is made,
the observed galaxy correlation functions can be reproduced
to a remarkable precision even at high redshift
(Conroy et al. 2006). At these redshifts, galaxies spend
substantially less time as satellite galaxies, and thus assumptions
about the stripping have less impact on clustering
measurements.

The total halo mass function, or the number density
of halos or subhalos of mass M, consists of the halo
and subhalo contributions. For the former, we adopt
the analytic formula given by Sheth & Tormen (1999),
and for the latter, the unevolved subhalo mass function from
van den Bosch, Tormen, & Giocoli (2005). The latter is
given in units of the number of subhalos of mass m in a
parent halo M, denoted here as $N(m|M)$. Then the total
mass function (MF) is expressed in a simple equation:

$$n_T(M) = n_h(M) + \int^{\psi M_p} N(M|M_p)\, n_h(M_p)\, dM_p \qquad (3)$$



where $M_p$ is the mass of the parent halo. Note that
we have changed notation for brevity, from $dn_T/dM$ to
$n_T(M)$ and $dn_h/dM$ to $n_h(M)$ (the number density per
unit mass). The upper mass limit $\psi M_p$ is set by the fact
that no subhalo should be more massive than a significant
fraction of its parent halo. Because $N(m|M)$ is
negligible where $m \approx M$, the total halo number density
$n_T(M)$ is not sensitive to a particular choice of the $\psi$
parameter. In our case, the parameter $\psi$ is set to 0.5.

Now we combine the two main ingredients, the L-M 
relation and the total halo mass function, to express the 
number density of galaxies of luminosity L hosted in a 
halo/subhalo of mass M: 

$$\pi(L,M)\, dL\, dM = dP(L|M)\, n_T(M)\, \delta(L - \mathcal{L}(M))\, dL\, dM \qquad (4)$$

Note that by defining Equation (4) we assume that the
same L-M scaling law applies to halos and subhalos
of the same mass, where the unstripped mass is used
for subhalos. Similarly, we define the number density
of finding a "central" galaxy of luminosity L hosted in
a halo of mass M by replacing the total mass function
$n_T(M)$ with the halo mass function $n_h(M)$. Now we can
begin to express the observed galaxy statistics in terms
of the quantities that we described earlier. These include
the galaxy LF, the galaxy correlation function at a given
luminosity threshold, and the galaxy cross-correlation
function between different luminosity bins.

Fig. 1. UV LF estimates at z ~ 4, 5, and 6, taken from
Bouwens et al. (2007) and McLure et al. (2008) in filled and open
symbols, respectively. The characteristic luminosity $M^*$ and the
normalization parameter $\phi^*$ estimated by Bouwens et al. (2007)
are also indicated at the left and bottom of the figure for all three
samples. While the characteristic luminosity increases with cosmic
time, the faint-end slope $\alpha$ remains constant at $\alpha \approx -1.7$ throughout
the probed redshift range (3.5 < z < 6.5).

2.3. Luminosity Function 

The LF, $\phi(L)$ or $\phi(M_{1700})$, can be obtained by integrating
the total number density $\pi(L, M)$ over all
masses M. In other words, for a fixed luminosity L,
we sum over the probability-weighted number densities
of all halos that can achieve the luminosity within the
allowed scatter:

$$\phi(L)\, dL = dL \int dM\, \pi(L, M) \qquad (5)$$

A more useful quantity to compare with the observations is
the LF per unit magnitude, $\phi(M_{1700})$:

$$\phi(M_{1700}) = \frac{\ln 10}{2.5} \int L\, dP(L|M)\, n_T(M)\, dM \qquad (6)$$

The factor "$\ln 10/2.5\; L$" comes from changing variables
from luminosity L to absolute magnitude $M_{1700}$. Note
that in the limits of the variance $\sigma_L^2 \rightarrow 0$ and $\mathcal{DC} \rightarrow 1$,
the Gaussian probability distribution becomes a delta
function, reducing the equation to a simpler form:

$$\phi(M_{1700}) = \frac{\ln 10}{2.5} \int n_T(M)\, L\, \delta(L - \mathcal{L}(M))\, dM = \frac{\ln 10}{2.5}\, n_T(M) \left[ \frac{d\ln\mathcal{L}(M)}{dM} \right]^{-1} \qquad (7)$$

or $\phi(M_{1700})\, dM_{1700} = n_T(M)\, dM$. In this special case,
there is an exact one-to-one correspondence between
mass M and luminosity L (see, e.g., Vale & Ostriker
2006; Conroy et al. 2006): $n_g(L > L_{\rm min}) = n_T(M > M_{\rm min})$,
where $n_T$ is the total number density of halos/subhalos
and $L_{\rm min} = \mathcal{L}(M_{\rm min})$.
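
The reduction from Equation (6) to Equation (7) can be checked numerically. The sketch below uses toy ingredients that are purely illustrative assumptions on our part: a linear mean scaling law $\mathcal{L}(M) = M$ in arbitrary units and a power-law mass function $n_T(M) = M^{-2}$. Neither is the mass function or scaling law adopted in this work.

```python
import math

LN10_OVER_2P5 = math.log(10.0) / 2.5

def mean_luminosity(M):
    """Toy mean scaling law (assumption): L proportional to M."""
    return M

def n_total(M):
    """Toy mass function dn/dM (assumption): pure power law."""
    return M ** -2.0

def gaussian(L, mean_L, sigma_L):
    """Unit-normalized Gaussian in L, as in Equation (1) with duty cycle 1."""
    z = (L - mean_L) / sigma_L
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma_L)

def lf_per_mag(L, sigma_L, duty_cycle=1.0, n_steps=4000):
    """Equation (6): trapezoidal integral over mass at fixed luminosity L."""
    # For the linear toy law the integrand is negligible beyond 8 sigma of M = L.
    m_lo = max(L - 8.0 * sigma_L, 1e-6 * L)
    m_hi = L + 8.0 * sigma_L
    dm = (m_hi - m_lo) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        M = m_lo + i * dm
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * duty_cycle * gaussian(L, mean_luminosity(M), sigma_L) * n_total(M)
    return LN10_OVER_2P5 * L * total * dm

def lf_per_mag_no_scatter(L):
    """Equation (7) for the toy law, where d ln L / dM = 1/M at M = L."""
    M = L
    return LN10_OVER_2P5 * n_total(M) * M

ratio = lf_per_mag(1.0, 0.01) / lf_per_mag_no_scatter(1.0)
print(ratio)  # close to 1 when the scatter is small
```

With a scatter of 1% of the mean luminosity the two expressions agree to well within a percent, and lowering the duty cycle simply rescales the LF by the same factor.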



2.4. Halo Occupation Distribution 

The first moment of a halo occupation distribution,
or the average number of galaxies hosted in a "parent"
halo of mass M, consists of two components: a "central"
galaxy situated in the halo itself, and the "satellite"
galaxy contribution from massive subhalos. As we begin
our formulation with the (parent) halo and subhalo mass
functions, we can separate out the two contributions explicitly
to gain physical insight into their respective contributions
to the galaxy statistics. Note that the HOD
by definition refers to galaxies meeting a fixed criterion,
typically those above a luminosity threshold. In this
case, a galaxy is counted "in" or "out" of the given sample
depending on its observed luminosity with respect to
the sensitivity of a survey. In what follows, we derive
the central and satellite contributions to the HOD for
luminosity $L > L_0$.

The central contribution, assuming a very long duty
cycle ($\mathcal{DC} = 1$), and in the absence of scatter, is a simple
step function. A galaxy will be visible if $L > L_0$ or
$M > M_0$, where $M_0$ is the halo mass corresponding to
$L_0$, and invisible otherwise. Thus, $\langle N_g(M) \rangle = \Theta(M - M_0)$.
A constant but nonunity duty cycle would lower
this contribution by a constant factor, while the scatter
in the L-M relation would alter the shape of the HOD
by smearing across a range of masses (the probability
density $dP(L|M)$ in Equation (1) serves as a convolution
kernel). In the general case, an HOD can be expressed
as follows:

$$\langle N_h(M) \rangle_{L>L_0} = \int_{L>L_0} dP(L|M)\, dL = \epsilon(M) \qquad (8)$$

In the case of $\sigma_L \rightarrow 0$ and $\mathcal{DC} = 1$, Equation (8) reduces
back to a step function $\Theta(M - M_0)$. For nonzero $\sigma_L$,
the effect is most pronounced around $M_0$, where $N_g$ is
tapered down from its maximum value ($\mathcal{DC}$) to zero
within a range of masses determined by the luminosity
variance $\sigma_L^2$ in Equation (1).

The satellite component of the HOD, which we denote
as $\langle N_{sh}(M) \rangle_{L>L_0}$ here, can be derived similarly by
replacing the step function $\Theta(M - M_0)$ with $N(m|M)$.
The total number of satellite galaxies above a luminosity
threshold $L_0$ hosted by a parent halo of mass M is:

$$\langle N_{sh}(M) \rangle_{L>L_0} = \int_{L>L_0} dL \int^{\psi M} dm\, N(m|M)\, dP(L|m) \qquad (9)$$

where $\psi M$ is the maximum mass a subhalo can achieve
within the parent halo M. The total halo occupation
distribution $\langle N_g(M) \rangle$ consists of two terms:

$$\langle N_g(M) \rangle_{L>L_0} = \langle N_h(M) \rangle_{L>L_0} + \langle N_{sh}(M) \rangle_{L>L_0} \qquad (10)$$
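
The central and satellite terms can be sketched numerically as follows. The central term uses the closed form of the Gaussian integral; for the satellite term the integral over L is carried out first, which leaves a single integral over subhalo mass weighted by the occupation efficiency. The linear scaling law and the power-law `subhalo_number` below are toy assumptions introduced for illustration only; they are not the unevolved subhalo mass function adopted in this work.

```python
import math

def mean_luminosity(M):
    """Toy mean scaling law (assumption): L proportional to M."""
    return M

def central_hod(M, L0, sigma_L, duty_cycle=1.0):
    """Equation (8): Gaussian integral over L > L0 in closed form."""
    arg = (L0 - mean_luminosity(M)) / (math.sqrt(2.0) * sigma_L)
    return 0.5 * duty_cycle * math.erfc(arg)

def subhalo_number(m, M):
    """Toy N(m|M) per unit subhalo mass (assumption, not the adopted fit)."""
    return 0.3 * (m / M) ** -0.9 / m

def satellite_hod(M, L0, sigma_L, duty_cycle=1.0, psi=0.5, n_steps=2000):
    """Equation (9) with the L integral done first, leaving an integral over m."""
    # With the linear toy law, occupation is negligible below L0 - 8 sigma.
    m_lo = max(L0 - 8.0 * sigma_L, 1e-3 * L0)
    m_hi = psi * M
    if m_hi <= m_lo:
        return 0.0
    dlnm = math.log(m_hi / m_lo) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        m = m_lo * math.exp(i * dlnm)
        weight = 0.5 if i in (0, n_steps) else 1.0
        # dm = m dln(m), so the integrand carries an extra factor of m.
        total += weight * subhalo_number(m, M) * central_hod(m, L0, sigma_L, duty_cycle) * m
    return total * dlnm

def total_hod(M, L0, sigma_L, duty_cycle=1.0):
    """Equation (10): sum of the central and satellite terms."""
    return central_hod(M, L0, sigma_L, duty_cycle) + satellite_hod(M, L0, sigma_L, duty_cycle)

for M in (0.5, 1.0, 10.0, 100.0):
    print(M, central_hod(M, 1.0, 0.05), satellite_hod(M, 1.0, 0.05))
```

The central term rises from zero to the duty cycle across a mass range set by the scatter, while the satellite term only becomes important for parent halos much more massive than the threshold mass.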

2.5. Galaxy Auto-correlation Functions

Once the halo occupation for visible galaxies is deter- 
mined, the galaxy correlation function (CF) can be com- 
puted directly from the HOD. The derivation and cal- 
culation of the CFs that we adopt is mostly a standard 
procedure, but we present several key equations for com- 
pleteness, and highlight some of the features that need 
special attention, namely, the treatment of the second 
moment of the halo occupation distribution, and the estimation
of the integral constraints (IC, hereafter).

The two-halo term (from the spatial correlation of
two distinct halos) of the galaxy CF for the luminosity-limited
sample $L > L_0$, denoted as $\xi^{2h}_{gg,0}(r)$, is the galaxy-number-weighted
halo correlation function normalized by
the total galaxy number density $n_g$:

$$\xi^{2h}_{gg,0}(r) = \frac{1}{n_g^2} \int\!\!\int dM\, dM'\, n_h(M)\, n_h(M')\, \langle N_g(M) \rangle_0\, \langle N_g(M') \rangle_0\, \xi_{hh}(r; M, M') \qquad (11)$$

$$n_g = \int \langle N_g(M) \rangle_0\, n_h(M)\, dM \qquad (12)$$



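If we define the galaxy-number-weighted halo bias, or
galaxy bias, as

$$\langle b \rangle_0 = \frac{\int b_h(M)\, \langle N_g(M) \rangle_0\, n_h(M)\, dM}{\int \langle N_g(M) \rangle_0\, n_h(M)\, dM} \qquad (13)$$

where $b_h(M)$ is the bias of halos of mass M,
then the large-scale amplitude of the galaxy CF is linearly
proportional to that of the underlying dark matter
by a constant factor (halo-bias-squared), $\xi^{2h}_{gg,0}(r) = \langle b \rangle_0^2\, \xi_{\rm dm}(r)$,
with $\xi_{\rm dm}(r)$ denoting the dark matter correlation function.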
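
The number-weighted bias is a ratio of two mass integrals and is simple to evaluate on a logarithmic mass grid. In the sketch below, the mass function, the halo bias, and the step-function HOD are toy forms chosen by us for illustration; none of them is the fitting function used in this work.

```python
import math

def number_weighted_bias(hod, halo_bias, mass_function, m_lo, m_hi, n_steps=2000):
    """Equation (13): ratio of two trapezoidal integrals on a log-spaced mass grid."""
    dlnm = math.log(m_hi / m_lo) / n_steps
    num = 0.0
    den = 0.0
    for i in range(n_steps + 1):
        M = m_lo * math.exp(i * dlnm)
        weight = 0.5 if i in (0, n_steps) else 1.0
        # dM = M dln(M), hence the extra factor of M in the weight.
        w = weight * hod(M) * mass_function(M) * M
        num += halo_bias(M) * w
        den += w
    return num / den

# Toy ingredients (assumptions for illustration only).
def toy_mass_function(M, m_star=1e13):
    return M ** -2.0 * math.exp(-M / m_star)

def toy_halo_bias(M, m_star=1e13):
    return 1.0 + (M / m_star) ** 0.5

def step_hod(m0):
    """No-scatter, unit-duty-cycle central HOD with threshold mass m0."""
    return lambda M: 1.0 if M >= m0 else 0.0

b_faint = number_weighted_bias(step_hod(1e11), toy_halo_bias, toy_mass_function, 1e10, 1e15)
b_bright = number_weighted_bias(step_hod(1e12), toy_halo_bias, toy_mass_function, 1e10, 1e15)
print(b_faint, b_bright)
```

Raising the threshold mass removes the most abundant, least biased halos from the average, so the brighter sample ends up with the larger bias, in line with the observed luminosity dependence of clustering.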

To compute the one-halo term of the CF, we assume
that the central galaxy is situated at the center of the
parent halo and satellite galaxies follow the Navarro et
al. (1997) profile. Then, the one-halo term is expressed
as

$$\xi^{1h}_{gg,0}(r) = \frac{1}{n_g^2} \int n_h(M)\, \langle N_g(N_g - 1)(M) \rangle_0\, f(r, M)\, dM \qquad (14)$$



where the function $f(r, M)$ specifies the net internal distribution
for the central-satellite and satellite-satellite
galaxy pairs combined within a parent halo of mass M.
The same calculation can be worked out in Fourier
space:

$$P^{2h}_{gg,0}(k) = \left[ \frac{1}{n_g} \int y(k, M)\, b_h(M)\, \langle N_g(M) \rangle_0\, n_h(M)\, dM \right]^2 P_{\rm lin}(k)$$

$$P^{1h}_{gg,0}(k) = \frac{1}{n_g^2} \int \langle N_g(N_g - 1) \rangle_0\, y^p(k, M)\, n_h(M)\, dM \qquad (15)$$

where $y(k, M)$ is the Fourier counterpart of the Navarro-Frenk-White
(NFW) profile, $P_{\rm lin}(k)$ is the linear DM
power spectrum, and $\langle N_g(N_g - 1) \rangle$ is the second moment
of the HOD, which will be discussed in the following
subsection. The total galaxy ACF is $\xi_{gg}(r) = \xi^{1h}_{gg}(r) + \xi^{2h}_{gg}(r)$,
and the total galaxy power spectrum is $P_{gg}(k) = P^{1h}_{gg}(k) + P^{2h}_{gg}(k)$.

The angular correlation function $w(\theta)$ is related to the real-space correlation function $\xi_{gg}(r)$ (Limber 1953; Peebles 1980) as

$$w(\theta) = \int dz\, [N(z)]^2\, \frac{dz}{dr} \int \frac{dk\, k}{2\pi}\, P_{gg,0}(k, z)\, J_0(r(z)\,\theta k) \tag{16}$$

where $N(z)$ is the normalized redshift distribution function, $J_0$ is the zeroth-order Bessel function of the first kind, $P_{gg,0}(k, z)$ is the galaxy power spectrum for $L > L_0$ as defined in Equation 15, and $r(z)$ is the radial comoving distance.

2.5.1. The Second Moment of HOD 

Equations 14 and 15 show that the second moment of the HOD, $\langle N_g(N_g-1)(M)\rangle$, is a major determining factor for the one-halo term of the galaxy correlation function. Using N-body simulations, Kravtsov et al. (2004) showed that the second moment of the subhalos populating their hosts is Poisson. Because the number of central galaxies can only take integer values, when one includes both the central and satellite halos to study the full HOD, the second moment is sub-Poisson at low masses, and approaches Poisson at higher masses as the number of satellites $(N - 1)$ begins to dominate the statistics. Overall, the second moment of halos/subhalos is well described by

$$\langle N(N-1)\rangle \approx \langle N\rangle^2 - 1 \quad \text{for subhalos} \tag{17}$$



where we denote the halo number as $N$ to distinguish it from the galaxy number $N_g$. Using the halo catalogs created from a simulation described by Wechsler et al. (2006), we have independently verified that Equation 17 provides a valid description of the second moment for halos down to $\langle N\rangle$ as low as $\approx 0.05$, in accord with the results of Kravtsov et al. (2004), who reported similar results down to $\langle N\rangle \approx 0.01$ (see their Figure 4).

As we base our formalism on the halo statistics to predict the galaxy statistics, the second moment of the galaxy HOD is modeled to be consistent with these findings. An important point to note is that, in the presence of a finite duty cycle and scatter in the L-M relation, the second moment of the galaxy HOD, $\langle N_g(N_g-1)(M)\rangle$, can deviate significantly from that expected for the halos, $\langle N(N-1)(M)\rangle$. For example, consider a halo of mass $M_1$ with exactly three subhalos. Then the total number of halo pairs is $4 \times 3/2 = 6$. If the galaxy duty cycle is 50% without scatter (hence, the total occupation efficiency $\epsilon = 0.5$), then the number of galaxy pairs in the same halo is reduced on average by a factor of 4, because the probability of hosting a galaxy is halved for both the central galaxy and the satellite galaxies. As a result, the pair counts for galaxies should be scaled by the factor $\epsilon^2$. Hence, one needs to quantify how the second moment of galaxies, $\langle N_g(N_g-1)\rangle$, scales for an arbitrary value of the duty cycle when the second moment of halos, $\langle N(N-1)\rangle$, obeys Equation 17.

We carried out Monte Carlo simulations to study the effect of the duty cycle and luminosity scatter, and later verified the results with the high-resolution dark matter simulations described in Wechsler et al. (2006). We create a million halos of the same mass whose mean occupation is $\langle N\rangle$, and populate subhalos for each of the halos such that the number of satellites, with mean $\langle N_{sat}\rangle = \langle N\rangle - 1$, obeys Poisson statistics to satisfy Equation 17. Then we randomly assign "galaxies" to a subset of these halos/subhalos according to the total halo occupation efficiency $\epsilon$ to create a mock galaxy catalog, and compute the second moment for galaxies, $\langle N_g(N_g-1)\rangle$, averaged over all halos. We tried both a constant $\epsilon$ case (equivalent to a constant duty cycle and no L-M scatter) and a more general case in which $\epsilon(M)$ varies with halo mass. In both cases, we find that the second moment is well described by:

$$\langle N_g(N_g-1)\rangle \approx \langle N_g\rangle^2 - \epsilon^2 \quad \text{for galaxies} \tag{18}$$
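The Monte Carlo experiment described above can be sketched in a few lines. This is a minimal illustration for the constant-efficiency case only, assuming one central halo plus a Poisson number of subhalos with mean $\langle N\rangle - 1$, and an independent occupation probability $\epsilon$ for every halo and subhalo. The function names, the number of mock halos, and the seed are hypothetical choices, not those of the original simulations. Under these assumptions the expected result is $\langle N_g(N_g-1)\rangle = \epsilon^2(\langle N\rangle^2 - 1) = \langle N_g\rangle^2 - \epsilon^2$.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate with mean lam (Knuth's method; fine for small lam)."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def galaxy_second_moment(mean_n, eps, n_halos=200_000, seed=1):
    """Monte Carlo estimate of (<N_g>, <N_g (N_g - 1)>) at fixed halo mass.

    Each host has one central plus Poisson(mean_n - 1) subhalos (mean_n >= 1);
    every halo/subhalo independently hosts a galaxy with probability eps.
    """
    rng = random.Random(seed)
    sum_n, sum_pairs = 0, 0
    for _ in range(n_halos):
        n_total = 1 + poisson(rng, mean_n - 1.0)
        n_gal = sum(1 for _ in range(n_total) if rng.random() < eps)
        sum_n += n_gal
        sum_pairs += n_gal * (n_gal - 1)
    return sum_n / n_halos, sum_pairs / n_halos


# A host with three subhalos on average (mean_n = 4) and a 50% efficiency.
mean_ng, second = galaxy_second_moment(mean_n=4.0, eps=0.5)
print(mean_ng, second, mean_ng ** 2 - 0.5 ** 2)
```

The two last printed numbers should agree to within the sampling noise, which mirrors the factor-of-four reduction in pair counts discussed for a halo with exactly three subhalos.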

We have investigated how the internal distribution of galaxies should be modeled (assuming an arbitrary value of $\epsilon$), when the halos are distributed according to the NFW profile. The internal distribution of galaxies is determined by which of the two pair counts (satellite-satellite and central-satellite) dominates the total counts. We define the fractional contribution of the central-satellite term as $f_{cs} = N_{cs}/(N_{cs} + N_{ss})$, where $N_{cs}$ and $N_{ss}$ are the mean numbers of central-satellite and satellite-satellite pairs in a halo, respectively. The quantity $f_{ss}$ is defined analogously for the satellite-satellite pairs. We find that these quantities can be written in terms of the mean halo number $\langle N\rangle$ and the total halo occupation efficiency $\epsilon$:

$$N_{cs} = \epsilon\,(\langle N_g\rangle - \epsilon) = \epsilon^2 (\langle N\rangle - 1)$$

$$N_{ss} = \frac{1}{2} (\langle N_g\rangle - \epsilon)^2 = \frac{1}{2}\, \epsilon^2 (\langle N\rangle - 1)^2$$

where $\epsilon$, $N_{cs}$, $N_{ss}$, and $\langle N\rangle$ are all functions of mass $M$. Because both terms scale with $\epsilon^2$, the pair fraction parameter depends only on the halo statistics and not on the details of how galaxies occupy the halos. Based on our Monte Carlo simulations, we model the one-halo term seen in Equation 15 as $\langle N_g(N_g-1)(M)\rangle \approx \langle N_g(M)\rangle^2 - \epsilon^2(M)$, with $p = 1$ if $\langle N(N-1)\rangle < 1$ and $p = 2$ otherwise, where $p$ is the exponent of the Fourier-transformed NFW profile $y(k, M)$.
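The statement that the pair fractions depend only on the halo statistics can be checked directly. The sketch below assumes the forms $N_{cs} = \epsilon^2(\langle N\rangle - 1)$ and $N_{ss} = \epsilon^2(\langle N\rangle - 1)^2/2$, which follow from one central plus Poisson satellites thinned independently with probability $\epsilon$. The function names are hypothetical.

```python
def pair_counts(mean_n, eps):
    """Mean central-satellite and satellite-satellite galaxy pairs per halo.

    Assumes one central plus Poisson satellites with mean (mean_n - 1), and an
    occupation efficiency eps applied independently to every halo/subhalo.
    """
    n_sat = mean_n - 1.0
    n_cs = eps ** 2 * n_sat
    n_ss = 0.5 * eps ** 2 * n_sat ** 2
    return n_cs, n_ss


def central_satellite_fraction(mean_n, eps):
    """Fraction f_cs of one-halo pairs that are central-satellite (needs mean_n > 1)."""
    n_cs, n_ss = pair_counts(mean_n, eps)
    return n_cs / (n_cs + n_ss)


# The efficiency cancels in the ratio, leaving f_cs = 2 / (<N> + 1).
for eps in (1.0, 0.5, 0.1):
    print(eps, central_satellite_fraction(4.0, eps))
```

The sum $N_{cs} + N_{ss}$ equals $\epsilon^2(\langle N\rangle^2 - 1)/2$, that is, half of the second moment, as expected when each unordered pair is counted once.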

2.5.2. Integral Constraints 

When a model $w(\theta)$ is to be compared with the observational measure $w_{obs}(\theta)$, the integral constraint (IC) needs to be applied to correct for the systematic offset. This offset arises from the fluctuations of the density field and thus depends on the survey volume as well as the clustering strength (or the variance of the galaxy power spectrum within the survey volume, $\sigma_g^2$) of the population in question. Using any observational estimator (Peebles 1980; Hamilton 1993; Landy & Szalay 1993), the true correlation function is related to the measured one as:

$$w_{true}(\theta) = w_{obs}(\theta) + \sigma_g^2\, \frac{DD(\theta)}{RR(\theta)} \tag{19}$$



where $DD(\theta)$ is the number of galaxy pairs with angular separations in the range $[\theta - \delta\theta/2,\ \theta + \delta\theta/2]$, and $RR(\theta)$ is the analogous quantity for randomly distributed points in an area of the same geometry. In other words, the observed CF needs to be corrected for the bias $\sigma_g^2$, and then renormalized by the true background (pair) density $(1 + \sigma_g^2)$. The correction is usually a very small number for a reasonably large area ($\sigma_g^2 \ll 1$; for the GOODS $B_{435}$-band dropouts, we estimated $\sigma_g^2 \approx 0.012$). When galaxies are only weakly clustered in space, however, the correction could still make a significant contribution to the large-scale amplitude of the CF.

Footnote: Note that we denote the variance as $\sigma_g^2$ to distinguish it from the scatter of the luminosity-mass relation introduced earlier.

Because the integral constraint depends on the clustering strength, the estimation of the IC from the measured CF itself is an iterative process (Adelberger et al. 2005). This can introduce an additional error to the existing measurement error (shot noise and cosmic variance), especially when the observed CF itself has a large uncertainty. Our approach, on the other hand, allows a direct estimation of the IC from the shape of the (model) correlation function. For any given model with the L-M relation and a magnitude threshold (set by the data set), we compute the IC as below:

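$$IC = \frac{1}{\Omega^2} \int\!\!\int w(\theta)\, d\Omega_1\, d\Omega_2 \approx \frac{\sum_i RR(\theta_i)\, w(\theta_i)}{\sum_i RR(\theta_i)} \tag{20}$$

where $\Omega$ is the solid angle spanned by the survey and $RR(\theta_i)$ is the number of random pairs in the $i$th angular bin $\theta_i$.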
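A direct estimate of the IC from a model correlation function can be sketched as follows. The code assumes the commonly used random-pair-weighted average, $IC \approx \sum_i RR(\theta_i)\, w(\theta_i) / \sum_i RR(\theta_i)$, and applies the correction in the multiplicative form $1 + w_{true} = (1 + IC)(1 + w_{obs})$. Both forms are stated here as assumptions, and the toy $w(\theta)$ and pair counts are placeholders rather than measured quantities.

```python
def integral_constraint(w_model, rr):
    """RR-weighted average of a model w(theta) over the angular bins."""
    return sum(w * n for w, n in zip(w_model, rr)) / sum(rr)


def correct_for_ic(w_obs, ic):
    """Apply 1 + w_true = (1 + ic) * (1 + w_obs) bin by bin (assumed form)."""
    return [w + ic * (1.0 + w) for w in w_obs]


# Toy example: a power-law model and random-pair counts that grow with angle.
theta = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0]   # arcsec, illustrative bins
w_model = [0.5 * t ** -0.6 for t in theta]
rr = [t ** 2 for t in theta]                   # placeholder random-pair counts

ic = integral_constraint(w_model, rr)
print(ic, correct_for_ic(w_model, ic))
```

Because the weights are dominated by the widest bins, the IC is set mostly by the large-scale amplitude of the model, which is why it can matter for weakly clustered samples even when it is small in absolute terms.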

2.6. Galaxy Cross Correlation Functions and Close 
Pair Counts 

The statistical information given by the L-M scaling 
law leads us further into understanding the galaxy cross- 
correlation function (XCF) in the context of halo cluster- 
ing. A cross correlation function can provide useful ex- 
tra information in addition to the auto correlation func- 
tions. In particular, when galaxy (luminosity) samples 
are adequately defined, they can be a more direct probe 
to the halo occupation than the auto-correlation func- 
tions. The size of our data is unlikely to provide useful 
information, however, because the majority of galaxies 
in the sample still falls far on the faint side of the char- 
acteristic luminosity, and therefore the halo masses do 
not differ greatly for the galaxies in the bright bin and 
the faint bin (see later). Nevertheless, larger area sur- 
veys with reasonable depths can test the cross-correlation 
between halos widely separated in masses, and subse- 
quently could break potential degeneracies unresolved by 
using the auto-correlation functions alone. In this sec- 
tion, we present how we compute the cross-correlation 
function, which is generally similar to the procedure for 
the auto-correlation function. 

We define two independent luminosity bins $L_1 < L < L_2$ and $L > L_3$ ($L_3 > L_2$), which we denote as the "faint" and "bright" sample, respectively. The mean occupation number of bright/faint galaxies can be defined similar to
Equation [5] and [51 except that there is an upper limit 
this time in the luminosity integral: 



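$$\langle N_{g,f}(M)\rangle = \int_{L_1}^{L_2} dP(L|M)\, dL + \int_{L_1}^{L_2} dL \int^{\phi M} dm\, N(m|M)\, dP(L|m)$$

$$\langle N_{g,b}(M)\rangle = \int_{L_3}^{\infty} dP(L|M)\, dL + \int_{L_3}^{\infty} dL \int^{\phi M} dm\, N(m|M)\, dP(L|m) \tag{21}$$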



Similar to Equations 11 and 13, the two-halo term of the cross-correlation function $\xi^{2h}_{bf}(r)$ is linearly proportional to the DM correlation function by a constant factor

$$\xi^{2h}_{bf}(r) = \langle b\rangle_b\, \langle b\rangle_f\, \xi_m(r) \tag{22}$$

The average halo biases are computed using Equation 13 with $\langle N_g(M)\rangle$ replaced by $\langle N_{g,f}(M)\rangle$ or $\langle N_{g,b}(M)\rangle$.

The one-halo term of the XCF includes two main contributions: "bright central"-"faint satellite" and "bright satellite"-"faint satellite" pairs. Hence, the one-halo term of the XCF is expressed as

$$\xi^{1h}_{bf}(r) = \frac{1}{n_b\, n_f} \int n_h(M)\, \langle N_{g,b}(M)\rangle\, \langle N_{g,f}(M)\rangle_s\, f(r, M)\, dM \tag{23}$$



where $n_b$ and $n_f$ are the number densities of the bright and faint samples (computed similarly to Equation 12). $\langle N_{g,f}(M)\rangle_s$ is the satellite portion of the faint galaxy HOD (the second term on the right-hand side of Equation 21). Note that unlike the auto-correlation functions, the XCF depends on the first moments of the two HODs, $\langle N_{g,f}\rangle$ and $\langle N_{g,b}\rangle$, and not on the second moment, $\langle N_g(N_g-1)\rangle$. The total cross-correlation function is the sum of these two contributions, $\xi_{bf}(r) = \xi^{1h}_{bf}(r) + \xi^{2h}_{bf}(r)$. The angular cross-correlation function can be obtained through the Limber (1953) equation:

$$w_{bf}(\theta) = \int dz\, N^2(z)\, \frac{dz}{dr} \int \frac{dk\, k}{2\pi}\, P_{bf}(k, z)\, J_0[r(z)\,\theta k] \tag{24}$$

where the galaxy power spectrum, $P_{bf}(k, z)$, is defined similarly to Equation 15. Finally, the integral constraint for the XCF is

$$IC_x = \frac{\sum_i RR(\theta_i)\, w_{bf}(\theta_i)}{\sum_i RR(\theta_i)} \tag{25}$$



3. THE DATA AND SAMPLES 



The data used for the correlation function measures consist of the $B_{435}V_{606}i_{775}z_{850}$ imaging data taken with the Advanced Camera for Surveys (ACS) on HST, obtained as part of the Great Observatories Origins Deep Survey (GOODS; Giavalisco et al. 2004b) with a significant addition of exposure time taken as part of the supernovae search (Riess et al. 2007). For the observations and data processing details, we refer interested readers to Giavalisco et al. (2004b) and Lee et al. (2006), which describe the previous versions. The total exposure time for the v1.9 observations is 3, 3.3, 3.8, and 10 orbits (10σ limits: 28.2, 28.4, 27.7, and 27.5) for the $B_{435}$, $V_{606}$, $i_{775}$, and $z_{850}$ bands, respectively. As the data processing and sample selection are identical to those of the previous data product, we refer to Giavalisco et al. (2004a) and Lee et al. (2006). The total number of galaxies in our samples is 1565 and 1517 for the $B_{435}$-band dropouts and 658 and 461 for the $V_{606}$-band dropouts in the north and south GOODS fields, respectively, for the flux limit of $z_{850} < 27.5$ (a ≈30% improvement for both $B_{435}$- and $V_{606}$-band dropouts over the v1.0 samples). The total area covered by the two GOODS fields is roughly 300 arcmin$^2$.

The data sets for the UV LF measures we adopted in our analyses (Bouwens et al. 2007) include the same GOODS data in addition to the Hubble Ultra Deep Field (HUDF; Beckwith et al. 2006) and the UDF Parallel ACS Fields (UDF-Ps; Thompson et al. 2005; Bouwens et al. 2003). The UDF observations consist of 56, 56, 150, and 150 ACS orbits (10σ limits: 29.6, 30.0, 29.9, and 29.2) in the $B_{435}$, $V_{606}$, $i_{775}$, and $z_{850}$ bands, while the UDF-Ps observations consist of 9, 9, 18, and 27 orbits (10σ limits: 28.9, 29.2, 28.8, and 28.5 for the maximum exposure), respectively, for the same filters.

Footnote: When a very large scatter is allowed in the L-M relation, there can also be a contribution from the "bright satellite"-"faint central galaxy" pairs. However, for reasonable classes of models, the probability of such cases is negligible compared to the other two, and thus will not be considered. Considering the mass range of most halos ($N_g < 3$) likely probed in the data, even the bright satellite-faint satellite pairs should have much lower occurrences than the bright central-faint satellite pairs.

4. THE OBSERVATIONAL MEASURES 

For the UV LF measures in our analyses, we adopted the results presented by Bouwens et al. (2007). The main reason is that they used the same filter set and very similar selection criteria to those of the sample we used to measure the galaxy correlation functions (also see Giavalisco et al. 2004a; Lee et al. 2006). While there are minor differences in the color equations (compare the equations in § 2.3 of Bouwens et al. 2007 and those in § 2 of Lee et al. 2006), the estimated redshift distributions of the two selections at z ~ 4 and 5 are very similar in both median and full width at half-maximum (FWHM). Hence, the two selections effectively choose the same galaxies in both GOODS fields. Furthermore, the incompleteness introduced by a particular set of selection criteria is corrected to derive the UV LF, essentially removing the remaining minor differences, as discussed by Giavalisco et al. (2004b), Sawicki & Thompson (2006), and Bouwens et al. (2007). For galaxies at z ~ 5 and 6, we supplement the Bouwens et al. (2007) data points with those obtained from the UKIDSS Ultra Deep Survey (UDS) and Subaru XMM-Newton Survey (SXDS) presented by McLure et al. (2008). The UDS data cover a much larger contiguous area (~0.63 deg$^2$), and thus complement the ACS data sets at the bright end. The two measures are consistent with each other in the intermediate luminosity range where they overlap.

Figure 1 shows the LF estimates at z ~ 4, 5, and 6, identical to their Figure 3. We also indicate in the same figure their estimates of the characteristic luminosity $M^*_{1700}$ and the normalization parameter $\phi^*$ for all three samples. Bouwens et al. (2007) found that the faint-end slope $\alpha$ remains roughly constant at $\approx -1.7$. They also found that the characteristic luminosity increases considerably with cosmic time from z ~ 6 to 4, while the number density at the characteristic luminosity, $\phi^*$, evolves little.

For the angular correlation function (CF) measures, we refer interested readers to Lee et al. (2006), where the method is discussed in detail, namely how the observed $w(\theta)$ was derived and corrected for the integral constraint (IC). The new measures are fully consistent with the previous ones (v1.0) when the same magnitude thresholds are applied (with smaller error bars). Figure 2 illustrates the comparisons of the current and previous measures for the full samples of the $B_{435}$-band and $V_{606}$-band dropouts (the same IC was applied to both v1.0 and v1.9 measures for consistency, but see later). Figure 3 shows our measures for the three flux-limited subsamples for both $B_{435}$- and $V_{606}$-band dropouts to show their luminosity dependence on large scales ($\theta > 20''$).

In addition to these, we present for the first time the galaxy cross-correlation function of two independent magnitude bins for the $B_{435}$-band sample. The full sample is divided into two bins, the bright and faint bin, such that the number of galaxies is similar in the two samples (split at $z_{850} = 26.4$), resulting in 692 and 687 bright and faint galaxies in the south, and 629 and 746 bright and faint galaxies in the north. We computed the angular cross-correlation function using the Landy & Szalay (1993) estimator:

$$w_{bf,obs}(\theta) = \frac{D_1D_2(\theta) - D_1R_2(\theta) - R_1D_2(\theta) + R_1R_2(\theta)}{R_1R_2(\theta)} \tag{26}$$

where $D_1D_2$ is the number of galaxy-galaxy cross pairs, $D_1R_2$ and $R_1D_2$ are the numbers of galaxy-random and random-galaxy pairs, and $R_1R_2$ is the number of random-random pairs for groups 1 and 2. Figure 4 shows the measure corrected for a nominal integral constraint of 0.012 (but see later).

Footnote: Because the IC mainly arises from the large-scale clustering and the mean halo bias for the XCF is a geometric mean of that of the bright and faint sample, the value should not differ substantially from that of the ACF for the full sample. However, for our modeling, the integral constraint for the XCF, $IC_x$, need not be derived independently (see Equation 25).

Fig. 2. Angular correlation function for $B_{435}$-band dropouts (left) and $V_{606}$-band dropouts (right). In each panel, two data sets are compared; filled symbols represent the estimation of the correlation function based on the v1.9 data, while open symbols are for the v1.0 measures, shifted slightly to the left for clarity. The two measures are fully consistent with each other within error bars.

Fig. 3. Two-point auto-correlation function measures for the $B_{435}$- and $V_{606}$-band dropouts for three flux-limited subsamples are shown as filled symbols. For comparison, the data points for the full sample (two top panels) are shown in the other panels as open circles (slightly offset in angular separation for clarity). The nominal integral constraints (IC) estimated from similar samples (Lee et al. 2006) were applied to each measure. A larger bin size was used for the $V_{606}$-band samples for a better S/N.

5. MODELING THE L-M SCALING RELATION

So far, we have discussed a formalism to predict three galaxy statistics directly from the complete information of the halo statistics. We have also presented the observed measures of the same statistics, which, when compared against the model predictions, can shed light on the type of physical models for star formation at high redshift, and its dependence on halo properties such as mass. In other words, the main goal is to constrain a class of L-M scaling laws (the mean and variance) that, when used as input to the formalism, reproduce the observed galaxy statistics. Hence, the last piece of information we need is an educated guess, based on physical considerations, on the kind of scaling laws that we expect between galaxies and halos.

In the local universe, the total (halo) mass-to-light ratio (observed in the rest-frame optical or near-infrared) seems to have a minimum at ~ ^0^.^ -^(^ (e.g., van den Bosch et al. 2003b; Eke et al. 2005; Lin & Mohr 2004; Lin et al. 2004; Tinker et al. 2005; Vale & Ostriker 2006; Conroy & Wechsler 2008). Galaxy luminosity increases rather steeply with halo mass at low masses, then turns over toward high masses to a shallower slope. The turnover takes place at a mass scale ≈ 10^^ $M_\odot$. If all the observed galaxies are hosted in halos, and halo mass correlates with galaxy luminosity (as confirmed by observations), then the existence of this turnover is necessary to "map" the halo MF to the observed galaxy LF. Unlike the observed LF, characterized by an exponential decline at the bright end and a shallow power-law slope ($\alpha > -1.8$) at the faint end, the halo MF has a very steep power law at low masses ($\alpha_{halo} < -2.2$) and declines more slowly at high masses. At high redshift, the shape of the galaxy UV LF is still well approximated by a Schechter function (with a slope $\alpha \approx -(1.6 - 1.8)$) and the low-mass slope of the halo MF still remains steep. Hence, it is reasonable to assume that the L-M scaling law at high redshift resembles that of the local galaxies discussed in Vale & Ostriker (2006).

Our modeling of the L-M scaling relation largely comprises two components, namely, what we refer to as the average luminosity $C(M)$, and the variance in the luminosity scatter $\sigma_L^2$ (see Equation 1). We model the average luminosity $C(M)$ as an increasing function of mass with a characteristic mass $M_{0l}$, and parameterize it as:

$$C(M) = L_{0l} \left(\frac{M}{M_{0l}}\right)^{\alpha_l} \exp\left[-\left(\frac{M}{M_{0l}}\right)^{-\beta_l}\right] \tag{27}$$

Note that the function has the form of an inverted Schechter-like function, which increases as a power law with a slope $\alpha_l$ at high masses, and declines towards low masses, similar to the Vale & Ostriker (2006) parameterization. The degree of steepness towards low masses is set by $\beta_l$, an additional parameter we introduce into the conventional three-parameter Schechter function. We remind readers, however, that this particular parameterization is neither unique nor necessary. In fact, the only requirement that we impose is that the average luminosity is an increasing function of mass. Our four-parameter function merely serves as a tool to explore a wide range of the four-parameter space, from a scaling law much like a Schechter function, to a double power law with a knee, or even a single power law of any slope without a turnover. As mentioned previously, and as we shall see later, the turnover is naturally produced to match the halo MF to the observed galaxy LF.
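To make the behavior of the four-parameter mean relation concrete, the sketch below evaluates an inverted Schechter-like function of the form $L_0 (M/M_0)^{\alpha} \exp[-(M/M_0)^{-\beta}]$ and its logarithmic slope. This form is an assumption based on the description in the text (a power law of slope $\alpha$ at high masses, with a decline toward low masses whose steepness is set by $\beta$). The parameter values are illustrative and the function names are hypothetical.

```python
import math


def mean_luminosity(mass, l0, m0, alpha, beta):
    """Inverted Schechter-like mean L-M relation (assumed form)."""
    x = mass / m0
    return l0 * x ** alpha * math.exp(-(x ** -beta))


def log_slope(mass, **params):
    """Numerical d log L / d log M from a symmetric finite difference."""
    step = 1.001
    high = mean_luminosity(mass * step, **params)
    low = mean_luminosity(mass / step, **params)
    return math.log(high / low) / math.log(step ** 2)


# Illustrative parameters (luminosity in erg/s/Hz, mass in Msun/h).
example = dict(l0=10 ** 29.8, m0=1e12, alpha=0.4, beta=0.5)
for mass in (1e10, 1e11, 1e12, 1e13, 1e14):
    print(f"{mass:.0e}", mean_luminosity(mass, **example), log_slope(mass, **example))
```

Analytically the local slope is $\alpha + \beta (M/M_0)^{-\beta}$, so the relation steepens below the characteristic mass and approaches the pure power law $\alpha$ well above it. This is the turnover behavior the parameterization is designed to allow.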

As for the luminosity scatter $\sigma_L$, we parameterize it in the same way as the mean:

$$\sigma_L(M) = \sigma_{0s} \left(\frac{M}{M_{0s}}\right)^{\alpha_s} \exp\left[-\left(\frac{M}{M_{0s}}\right)^{-\beta_s}\right] \tag{28}$$

For the luminosity scatter, the requirements we impose are that, first, it is an increasing function of mass, and second, it must decrease steeply enough towards low masses to avoid unrealistic cases where the galaxy statistics are dominated by, for example, < 10^ $M_\odot$ halos (ruled out observationally). Again, the four-parameter model gives us the flexibility to explore different forms of scatter, and does not necessarily require the existence of any characteristic mass scale of the scaling law, as it is possible to model, for example, a single power law with a suitable choice of the slopes, $\alpha_s$ and $\beta_s$. By adjusting the normalization parameter $\sigma_{0s}$, the scatter can be made to have a negligible effect (i.e., a no-scatter model) on the galaxy statistics. We also define a fractional scatter at a given mass $M$ to be the ratio of the luminosity scatter to the mean, and refer to the quantity as the $B$ parameter hereafter (see later):

$$B(M) = \sigma_L(M)/C(M) \tag{29}$$

Fig. 4. Cross-correlation function between bright and faint $B_{435}$-band dropouts. The full sample was split into two magnitude bins at $z_{850} = 26.4$ to define the bright half and faint half. The observed XCF was measured, and then corrected for a nominal integral constraint IC = 0.012 (see text for the estimation of the IC).

TABLE 1
The Range of Parameters Used for the Mean and Variance of the L-M Scaling Laws

           $\log L_{0l}$   $\log M_{0l}$   $\alpha_l$   $\beta_l$   $\log \sigma_{0s}$   $\log M_{0s}$   $\alpha_s$   $\beta_s$
Minimum    29.10           11.50           0.03         0.20        27.30                10.20           0.03         ...
Maximum    30.60           13.50           0.83         0.65        28.50                11.20           0.83         ...

Note. See Equations 27 and 28 for the definitions. Masses are in units of $h^{-1} M_\odot$, and luminosities are in units of erg s$^{-1}$ Hz$^{-1}$.

Note that our modeling of the median and variance of the L-M scaling laws allows for a wider range of possibilities than most previous works. For example, Tasitsiomi et al. (2004) constrained the scaling relation between the rest-optical r-band luminosity and halo circular velocity by matching the halo circular velocity function to the r-band LF (Blanton et al. 2003) in the presence of scatter. They assumed a constant scatter in magnitude (a lognormal distribution in luminosity; also see, e.g., Giavalisco & Dickinson 2001; Yang et al. 2003) throughout the relevant range of halo circular velocities, effectively assuming that the fractional scatter, which we defined as the $B$ parameter earlier, remains constant. While their one-parameter scatter model is much simpler than our four-parameter model, our approach is more flexible by allowing scenarios in which the fractional scatter can be much larger in some mass ranges than in others due to, e.g., starbursts (particularly suitable for the UV-selected samples). In addition, our model can be used for scenarios similar to that discussed in Tasitsiomi et al. (2004) by modeling $\sigma_L(M)$ appropriately with respect to the mean.

6. RESULTS

6.1. Evolution of LF and Star Formation Duty Cycle

We begin by demonstrating our formalism with a simple case of a constant duty cycle and no scatter, mainly to examine if such a model provides a viable description of the observations. We assume four duty cycles, DC = 1, 0.5, 0.25, and 0.10 (corresponding to halo selection efficiencies of 100, 50, 25, and 10%, respectively). For each of the four DC values, we generate a grid of models for the average luminosity $C(M)$ by varying four parameters (see Equation 27), and compute a LF for each model using Equation 7. The model LF is then used to compute the chi-square to test its goodness-of-fit against the observed measure. Table 1 shows the minimum and maximum values of all eight parameters that we used to create the grid. Parameters outside the specified values will result in a LF that is hugely discrepant from the observations, and hence a wider range of parameter space will not affect any of the results presented below.

As for the observed LFs, we note that the error bars are underestimated (see Figure 1). We find that the minimum reduced $\chi^2$ is always larger than ≈ 2 (for example, at z ~ 4 we find a reduced $\chi^2$ of 2) for the best-fit Schechter parameters given in Bouwens et al. (2007). In other words, no smooth monotonically increasing function will yield a reduced chi-square less than 2 for the data points at z ~ 4 presented in Bouwens et al. (2007). This is likely a result of a systematic bias introduced in correcting for the observational incompleteness combined with Poisson noise in the galaxy number counts. Thus, we define the confidence level as $\Delta\chi^2$ above the minimum possible $\chi^2$ to assess the fit to the data instead of the actual $\chi^2$. Assuming a normal distribution with 13 degrees of freedom, the chi-square distribution function gives $\Delta\chi^2$ = 0.949, 1.524, and 2.185, each corresponding to the 50%, 90%, and 99% confidence level, respectively.
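The first two quoted thresholds appear to correspond to quantiles of the chi-square distribution for 13 degrees of freedom, divided by the number of degrees of freedom. The sketch below checks this reading with the standard library only. The CDF is evaluated from the series expansion of the regularized lower incomplete gamma function and inverted by bisection. The interpretation as a per-degree-of-freedom quantity is an assumption, and the function names are hypothetical.

```python
import math


def chi2_cdf(x, dof):
    """Chi-square CDF via the series for the regularized incomplete gamma P(dof/2, x/2)."""
    a, z = 0.5 * dof, 0.5 * x
    if z <= 0.0:
        return 0.0
    term = total = 1.0 / a
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_quantile(prob, dof):
    """Invert the CDF by bisection on a generous bracket."""
    low, high = 0.0, 10.0 * dof + 100.0
    for _ in range(200):
        mid = 0.5 * (low + high)
        if chi2_cdf(mid, dof) < prob:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


dof = 13
for level in (0.50, 0.90, 0.99):
    print(level, chi2_quantile(level, dof) / dof)
```

Small differences from the quoted values can arise from rounding or from the exact convention used to define the confidence levels.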

In Figure 5, we show the upper and lower bounds of our LF models at the 90% confidence level together with the Bouwens et al. (2007) measures at z ~ 4. A solid black line indicates the best-fit Schechter function to the data given in Bouwens et al. (2007). Two hatched regions below the LF indicate the contribution to the total LF by subhalos for the two extreme cases (DC = 100% and 10%). The figure shows that a lower duty cycle (light gray) requires a larger contribution from the subhalo population than higher duty cycles (dark gray). For any fixed luminosity, a lower duty cycle effectively reduces the mass threshold above which halos are allowed to host a visible galaxy, and results in more satellite galaxies being included in the sample.

The right panel of Figure 5 shows the range of the L-M scaling laws for the same models for all four duty cycle values (10, 25, 50, 100%). For a fixed mass M, a halo with a lower duty cycle is required to have a higher luminosity than its counterparts with a higher duty cycle, in order to satisfy the observed LF. Figures 6 and 7 show analogous plots for the two higher redshift samples (z ~ 5 and 6), showing similar trends. We note that the duty cycle is an input rather than a quantity one can constrain when the LF measure alone is used as a constraint. We postpone the discussion of the range of physical duty cycle values to the next section, where we consider the clustering constraints together with the LF.
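The effect of the duty cycle on the mass threshold can be illustrated with a toy abundance-matching calculation. The sketch assumes no luminosity scatter, ignores satellites, and adopts a power-law cumulative halo abundance $n_h(>M) = A\,(M/M_{ref})^{-s}$. The normalization, reference mass, and slope are placeholders, and the function name is hypothetical. Matching $DC \times n_h(>M) = n_g(>L)$ then gives the halo mass associated with a given luminosity threshold.

```python
def halo_mass_threshold(n_gal, duty_cycle, amp=1e-2, m_ref=1e11, slope=1.0):
    """Mass M at which duty_cycle * n_h(>M) equals the galaxy density n_gal.

    Assumes n_h(>M) = amp * (M / m_ref) ** (-slope), no scatter, no satellites.
    """
    return m_ref * (duty_cycle * amp / n_gal) ** (1.0 / slope)


# Same observed galaxy density, four duty cycles.
n_gal = 1e-3  # illustrative comoving number density above some luminosity
for duty_cycle in (1.0, 0.5, 0.25, 0.10):
    print(duty_cycle, halo_mass_threshold(n_gal, duty_cycle))
```

For a fixed observed abundance, a lower duty cycle pushes the matched halo mass downward, so a halo of a given mass must be assigned a higher luminosity. This is the trend seen in the right panels of Figures 5 to 7.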

The inferred L-M model from the observed LF implies that the L-M relation is approximately a power law and turns over around the characteristic luminosity (marked as a horizontal line on the left in Figures 5-7). Due to larger uncertainties at the bright end of the LF, however, the extent of the turnover is not well constrained with the current data. For the mass range below that corresponding to the characteristic luminosity $L^*$, the power-law slope of the L-M scaling law for a fixed duty cycle is ≈ 1.2. If we consider a case where the duty cycle increases continuously as a function of mass, as an extreme case, from 10% for $M \sim 10^{10}\, h^{-1} M_\odot$ to 100% at $M \sim$ 10^^ $h^{-1} M_\odot$, the power-law slope is ≈ 0.9. In other words, under any reasonable assumptions as to the duty cycle, our results suggest that for the majority of galaxies below $L^*$, the observed UV luminosity scales approximately linearly with the host halo mass. Hence, in this halo context, the constant faint-end slope observed from z ~ 3 to 6 ($-\alpha \approx$ 1.6-1.7; Steidel et al. 1999; Giavalisco et al. 2004a; Bouwens et al. 2007; Reddy et al. 2008) is a result of the fact that the power-law slope of the L-M scaling law is approximately unity throughout these epochs.
Footnote: In reality, it is unlikely that the duty cycle can be as low as 10%, as will be shown in the next section.

Fig. 5. UV LF and L-M scaling relation for galaxies at z ~ 4 (axes: absolute magnitude $M_{1700}$ in the left panel; log halo mass in the right panel). Left: the range of LF models consistent with the observations is shaded in gray (90% confidence). Black points and line indicate the observational measure and best-fit Schechter function from Bouwens et al. (2007). Two hatched regions illustrate satellite contributions for duty cycles of 100% (dark gray) and 10% (light gray). For a lower duty cycle, the satellite contribution is required to increase significantly. Right: the range of $L_{UV}$ allowed for the same set of models as a function of mass M. This time, all four duty cycle cases (10, 25, 50, 100%) are shown. A lower duty cycle requires increased luminosity for fixed mass (and decreased M/L) in order to reproduce the observed shape of the LF. We also mark the characteristic luminosity $M^*_{1700}$ on the left, and the corresponding characteristic halo mass for each case on the bottom.

Fig. 6. UV LF and inferred L-M scaling laws for galaxies at z ~ 5. The LF measures from both Bouwens et al. (2007) and McLure et al. (2008) (shown in filled and open symbols, respectively, on the left) are used to constrain the models.



While the slope of the L-M scaling law remains roughly constant, the amplitude of the L-M relation seems to change with redshift, for the simple cases we consider here. If the star formation duty cycle arises from a physical mechanism that does not evolve significantly from z ~ 6 to z ~ 4, we can begin to infer the evolution of the L-M relation with cosmic time directly from the evolution of the observed UV LF. Figure 8 illustrates this trend for a fixed duty cycle of 50%, but the same trend holds for other DC values. Figure 8 shows that the UV luminosity at a fixed mass decreases with time by a few tenths of a magnitude (right panel) from z ~ 5 to 4. This is in qualitative agreement with results inferred from a previous clustering study, that for a fixed $L_{UV}$ threshold, galaxies at z ~ 5 have an average bias consistent with lower halo masses than those at z ~ 3 and 4 (Lee et al. 2006). Due to large uncertainties associated with the LF measures at z ~ 6, it is unclear if the same trend continues further back in time.

If we define a characteristic halo mass correspond- 
ing to a characteristic luminosity L* , the same trend can 
be viewed as the characteristic mass decreasing with redshift.
Both L* and the corresponding characteristic mass for each
sample are indicated on the left and bottom of Figures 5-7.
In other words, the masses of halos that host L* galaxies
were lower at earlier times by a few tenths of a dex
(from z ~ 6 to 4, ΔM* ≈ 0.5 dex). Interestingly, the
brightening of the characteristic
luminosity and the dimming of the UV luminosity for a 
fixed mass M, take place in such a way that they com- 
pensate each other, and as a result, produce the roughly 
Fig. 7. - UV LF and inferred L-M scaling laws for galaxies at z ~ 6. The LF measures from both Bouwens et al. (2007) and McLure et al.
(2008) (shown in filled and open symbols, respectively, on the left) are used to constrain the models.



constant normalization parameter φ* throughout these
epochs, i.e., the number density of halos at a fixed mass 
M increases with time, while the UV luminosity for the 
same mass M decreases with time. 

So far, our conclusions are based solely on the LF 
constraints. In order to draw more physically mean- 
ingful conclusions from our model, which was built to 
bring together all the relevant observational constraints 
into a single framework, we need to consider the cluster- 
ing constraints in conjunction with the LF constraints. 
In Section 6.2, we explore the implications of the observed
luminosity-dependent clustering measures for simple
cases of a constant duty cycle before we extend our
analyses to more general cases (discussed in Section 6.3).

6.2. Luminosity-Dependent Galaxy Clustering 

We compute a set of angular correlation functions w(θ)
for the same models discussed in the previous section
(Section 6.1). These models were chosen to match the
observed LF for a given fixed duty cycle (90% confidence
limits). A model correlation function (CF) was computed for
each $\mathcal{L}(M)$ model (four parameters; see its defining
equation) as described previously, and the integral constraint (IC)
was then estimated directly from the model CF. We correct the
observed CF for the integral constraint before we evaluate
the goodness of fit against the model w(θ). In the
case of no L-M scatter, most $\mathcal{L}(M)$ models have
effectively the same mass threshold, and thus the IC values
do not vary significantly among different models. Figure 9
shows the observational measures at z ~ 4 for three
subsamples (from left, $z_{850}$ < 27.5, 26.5, 26.0) together
with model predictions for four duty cycle values (from
top, 100, 50, 25, 10%). Note that the data points in
each panel differ even though the same data are
used, because the observed CF is corrected for the
respective integral constraint in each panel. The reduced
chi-square values and IC values are also shown in the
upper right corner of each panel. Figure 10 shows the
same plot for the $V_{606}$-band dropouts.
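
The integral-constraint correction described above can be sketched numerically. The text does not spell out the estimator, so the sketch below assumes one common choice: the IC is the random-pair-weighted mean of the model w(θ) over the survey, IC = Σ RR(θ) w_model(θ) / Σ RR(θ), and the observed CF is raised by that amount. The function names, the power-law model, and the pair counts are illustrative placeholders, not values from this work.

```python
import numpy as np

def integral_constraint(w_model, rr_counts):
    """RR-weighted mean of the model correlation function (assumed estimator)."""
    w_model = np.asarray(w_model, dtype=float)
    rr_counts = np.asarray(rr_counts, dtype=float)
    return float(np.sum(rr_counts * w_model) / np.sum(rr_counts))

def correct_for_ic(w_observed, ic):
    """Add the integral constraint back to the observed correlation function."""
    return np.asarray(w_observed, dtype=float) + ic

# Toy inputs (hypothetical): a power-law model and random-random pair counts per bin.
theta = np.array([1.0, 3.0, 10.0, 30.0, 100.0])      # arcsec
w_model = 0.5 * theta ** -0.6
rr_counts = np.array([10.0, 90.0, 1000.0, 9000.0, 50000.0])

ic = integral_constraint(w_model, rr_counts)
w_obs = w_model - ic          # what a finite field would measure on average
w_corrected = correct_for_ic(w_obs, ic)
```

Because the pair counts grow with separation, the IC is set mostly by the large-scale bins, which is why it matters most where the CF amplitude is lowest.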

The large-scale (θ > 20"-30") amplitude in the models
decreases with decreasing duty cycle, as expected.



This is because the effective mass threshold for halos is 
required to be lower for lower duty cycles in order to re- 
produce the observed total number density (LF). As a re- 
sult, host halos are on average more weakly correlated for 
lower duty cycle scenarios. On large scales, the observed 
measures are consistent with a wide range of duty cycle 
values, and thus do not provide a strong constraint for
discriminating among different models. The relatively small
area of the surveyed region and the only weak-to-moderate
clustering strength of these faint star-forming galaxies
make it difficult to estimate robustly the
true large-scale amplitude of the correlation function,
because the IC accounts for a non-negligible portion of
the large-scale amplitude. This is best illustrated
by how the data points corrected for the IC follow the
model curves in Figures 9 and 10. Surveys conducted
in larger areas or more strongly clustered galaxy samples 
(brighter star-forming galaxies or rest-frame optically se- 
lected galaxies at high redshift, for example) should be 
less affected by the problem, and thus will provide a bet- 
ter constraint to the models. 

On the other hand, the differences among four duty cy- 
cles are more apparent at small angular scales where the 
amplitude of the CF is much larger, and thus the effect 
of the IC correction is negligible. For the case of a very
long duty cycle (DC = 100%), the models overpredict
the small-scale amplitude (χ² ≈ 1.8), consistent with the
results of dark matter simulations (Conroy et al. 2006).
As the duty cycle is lowered to 25%-50%, the small-scale
amplitude moves gradually down into better agreement with the
data (χ² ~ 0.7), and then falls below the data for the
10% duty cycle (χ² ~ 1.6).

We computed the correlation function predictions for
duty cycle values ranging from 5% to 100% in
increments of 5%, using the models that reproduce the
observed LF at the 90% confidence level for each given
duty cycle. Then we computed the chi-square values

The full sample corresponds to a correlation length of roughly
2.5 $h^{-1}$ Mpc, much lower than the ≈ 4 $h^{-1}$ Mpc of their
brighter counterparts with R < 25.5 (Adelberger et al. 2005;
Gawiser et al. 2006; Lee et al. 2006).
Fig. 8. - Evolution of the L-M relation from z ~ 6 to z ~ 4 inferred from the evolution of the LF, assuming a constant duty
cycle of 50% at all epochs. Three short horizontal lines (left) mark the characteristic luminosity L* at z ~ 4, 5, and 6 from Bouwens et al.
(2007), while three short vertical lines (bottom) mark the corresponding masses (the median value for the allowed models). The left panel
illustrates the full range while the right panel shows the mass range where the constraint is robust. While the large errors in the z ~ 6
measures make it unclear whether the trend continues to z ~ 6, the change from z ~ 5 to 4 is clear: at a fixed mass, the observed UV
luminosity was higher at earlier times, in qualitative agreement with what was found from a clustering study (Lee et al. 2006).



of these models against the highest S/N measures (full
sample) available to us at z ~ 4 and 5. Figure 11 shows
the range of the reduced chi-square values for all the
considered models. For the $B_{435}$-band dropouts, the
chi-square reaches its minimum at a duty cycle of 30%
and increases steeply on either side. The formal 1σ range
(Δχ² < 1.2) of the duty cycle at z ~ 4 is $DC = 30^{+\ldots}_{-\ldots}\%$,
and hence scenarios with an extremely short (DC < 10%)
or long (DC > 70%) duty cycle are ruled out at the 90% confidence
level. For the $V_{606}$-band dropouts, a similar trend is seen
even though the observational measures are much noisier
than in the $B_{435}$-band dropout case. Very long duty
cycles (> 80%) are still ruled out with high significance based
on the correlation function measures at z ~ 5.
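
The duty-cycle scan above amounts to locating the chi-square minimum on a grid and reading off the interval within a chosen Δχ² of it. The following is a minimal sketch of that bookkeeping only; the function name, the Δχ² threshold passed in, and the parabolic toy chi-square curve are assumptions for illustration and do not reproduce the measured values.

```python
import numpy as np

def duty_cycle_interval(duty_cycles, chi2, delta_chi2):
    """Return (best, low, high): the grid value minimizing chi2 and the
    contiguous range around it with chi2 - chi2_min <= delta_chi2."""
    dc = np.asarray(duty_cycles, dtype=float)
    c = np.asarray(chi2, dtype=float)
    best = int(np.argmin(c))
    ok = (c - c[best]) <= delta_chi2
    lo = best
    while lo > 0 and ok[lo - 1]:
        lo -= 1                      # extend the interval to lower duty cycles
    hi = best
    while hi < len(dc) - 1 and ok[hi + 1]:
        hi += 1                      # extend the interval to higher duty cycles
    return float(dc[best]), float(dc[lo]), float(dc[hi])

# Toy chi-square curve (hypothetical): a parabola centered on 30%.
grid = np.arange(5, 101, 5)               # 5% to 100% in steps of 5%
toy_chi2 = ((grid - 30.0) / 10.0) ** 2
best, low, high = duty_cycle_interval(grid, toy_chi2, delta_chi2=1.0)
```

On a coarse 5% grid the interval edges are only resolved to the grid spacing, so quoted bounds would normally be interpolated between grid points.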

So far, we have explored simple scenarios where a duty 
cycle can vary, but the L-M scaling law holds a one-to- 
one relation without any scatter. Despite the simplicity 
in the cases discussed in the previous sections, we shall 
see later that the main conclusions do not change sig- 
nificantly when the fully general cases are considered. 
In the next section, we explore more general scenarios
where the L-M relation can have a non-negligible scatter
component, $\sigma_L(M)$, in addition to a duty cycle. Due
to the large uncertainties in the CFs of the $V_{606}$-band
dropout sample, we focus on analyses of the $B_{435}$-band
dropouts from here on.

6.3. The Effects of Scatter on the LF and Clustering 
The most general form of our model consists of nine parameters:
four for the average luminosity $\mathcal{L}(M)$, another
four for the luminosity scatter $\sigma_L(M)$, and a constant
duty cycle. Exploring the full nine-parameter space is
therefore very time consuming. We adopt the following
simplified procedure: first, we generate a random
$\mathcal{L}(M)$ model and construct the corresponding LF (i.e.,
without scatter), then evaluate whether the given model can
be improved by introducing additional scatter $\sigma_L$. For
example, if the model LF already predicts a higher
number density of galaxies than the data, we discard the
model. The reason is that the introduction of scatter
effectively runs in one direction: a boost in the number
density at any given luminosity. Although the luminosity
scatter can go in either direction, as it is modeled to be
normally distributed around the mean $\mathcal{L}(M)$, the shape
of the halo mass function implies that the net change in
the LF in the presence of scatter will always be dominated
by low-mass halos entering the galaxy sample
by scattering to a luminosity higher than their mean value
(an increase in number density), and not vice versa. Hence,
if the model already predicts a higher number density
than the data without scatter, the fit is always worse in
the presence of the $\sigma_L$ scatter.
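
The one-directional effect of scatter can be checked with a toy calculation. The sketch below is not the model used in this work: it assumes a placeholder halo mass function dn/dlnM ∝ (M/M_c)^-1 exp(-M/M_c), a linear mean relation in arbitrary units, and Gaussian scatter in luminosity with a width equal to a fixed fraction of the mean. All names and numbers are hypothetical.

```python
import math
import numpy as np

_erfc = np.vectorize(math.erfc)

def number_density_above(l_thresh, sigma_frac, n_grid=4000):
    """Toy number density of halos hosting galaxies brighter than l_thresh."""
    log_m = np.linspace(-3.0, 2.0, n_grid)        # log10(M / M_c)
    mass = 10.0 ** log_m
    dn_dlnm = np.exp(-mass) / mass                # placeholder mass function
    mean_l = mass                                 # placeholder linear L-M relation
    if sigma_frac > 0:
        # Gaussian scatter: probability that a halo of this mass exceeds l_thresh
        z = (l_thresh - mean_l) / (sigma_frac * mean_l)
        p_above = 0.5 * _erfc(z / math.sqrt(2.0))
    else:
        p_above = (mean_l >= l_thresh).astype(float)   # sharp mass threshold
    y = dn_dlnm * p_above
    ln_m = np.log(mass)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(ln_m)))  # trapezoid rule

without = number_density_above(l_thresh=3.0, sigma_frac=0.0)
with_scatter = number_density_above(l_thresh=3.0, sigma_frac=0.5)
```

With the threshold placed on the steep part of the mass function, the halos scattered up from below the threshold outnumber those scattered down from above it, so `with_scatter` exceeds `without`.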

Once we find a plausible base model for the mean scaling
law $\mathcal{L}(M)$ (five parameters, one for the duty cycle
and four for the mean, are fixed from the shape of
the LF), we vary $\sigma_L$ models randomly and evaluate the
change in the LF each time. We repeat the procedure until
either we reach a set of $\sigma_L$ parameters that gives a χ²
equal to or less than the value corresponding to the 99%
confidence level, or we exhaust the four-parameter
space for the scatter and find no suitable model. Figure 12
shows one of the models found via this procedure,
as an example illustrating the effect of scatter on the
shape of the LF, the correlation functions, and the
inferred HOD (solid lines) in comparison to the model with
the same duty cycle and average L-M scaling law, but
without scatter (dashed lines). Similar random realizations
were carried out to obtain a few thousand models
for each of the four fixed duty cycles, and the goodness
of fit was recorded separately for the LF, each of the
correlation functions, and the cross-correlation function
($\chi^2_{LF}$, $\chi^2_{w1,2,3}$, and $\chi^2_{wx}$, respectively) against the
corresponding observational measures.
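
The accept-or-give-up loop in this procedure can be sketched as a simple random search. This is only a schematic: `chi2_of` stands in for the real LF comparison, the parameter bounds are arbitrary, and capping the number of trials is an assumed substitute for "exhausting" the four-parameter scatter space.

```python
import random

def search_scatter_model(chi2_of, bounds, chi2_max, max_trials=10000, seed=0):
    """Draw scatter parameters uniformly within bounds until chi2 <= chi2_max.

    Returns the first accepted parameter list, or None if no trial succeeds.
    """
    rng = random.Random(seed)
    for _ in range(max_trials):
        params = [rng.uniform(lo, hi) for lo, hi in bounds]
        if chi2_of(params) <= chi2_max:
            return params
    return None

# Hypothetical stand-in for the LF chi-square of a four-parameter scatter model.
def toy_chi2(params):
    return sum(p * p for p in params)

four_param_bounds = [(-1.0, 1.0)] * 4
found = search_scatter_model(toy_chi2, four_param_bounds, chi2_max=0.5)
not_found = search_scatter_model(toy_chi2, four_param_bounds, chi2_max=-1.0,
                                 max_trials=100)
```

Base models that already overpredict the number density would be rejected before this loop is entered, since scatter can only raise the predicted counts.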

From these models, we have studied the respective ef- 
fects of varying SF duty cycle and the L-M scatter, and 
found that the duty cycle is a major factor in determin- 
Fig. 9. - Correlation function predictions for four duty cycles (10%-100%) for galaxies at z ~ 4. The model predictions for the CF
for three luminosity subsamples are shown together with the observational measures (black circles). Each column corresponds to the same
data but with the model predictions assuming different duty cycle values (indicated in the bottom left corners), while each row shows the
CFs for three luminosity samples for a fixed duty cycle. Green shaded regions in each panel indicate the range of the CF amplitude, w(θ),
possible for all the models selected based on the LF constraint (90% confidence) shown in Figure 5. The reduced chi-square values and the
median integral constraints (IC) estimated from the corresponding models are shown in the upper right-hand corner.



ing the galaxy correlation function on small scales, even 
in the presence of the L-M scatter. Even though the L- 
M scatter also suppresses the amplitude of the one-halo 
term, the joint constraints "preserve" the observed shape 
of the LF by compensating for such suppression. The reason
for this is best illustrated in Figure 12. Any successful
model with a significant contribution from the L-M
scatter should have a mean scaling law $\mathcal{L}(M)$ that
declines more steeply toward low masses (Figure 12, solid
line in upper left panel) than one with less contribution
from the scatter. The dashed line in the lower left panel
shows the shape of the LF for the same $\mathcal{L}(M)$ model.
Both the steep drop of the contribution from low-mass halos
(upper left) and the low total number density implied
by the LF (dashed line, lower left) have the same
consequence: an increase in the median halo mass
of the observed galaxies. Higher halo masses also imply
that a larger fraction of halos now contain dark matter
substructure, and thus a larger one-halo term in the
correlation functions (dashed lines in the three right panels).
In essence, the kind of $\mathcal{L}(M)$ models that allow a large
scatter naturally require a more pronounced one-halo
term in the absence of scatter.
Next, we consider the consequences of adding scatter 



to this particular case (the scaling law for the scatter
is shown in the upper left panel as a dash-dotted line). The
scatter now allows a subset of relatively low-mass halos to
increase their luminosity and participate in the galaxy
sample. As a result, the LF in the presence of scatter
successfully recovers the deficit in the galaxy number
density needed to agree with the data (lower left). As
for the correlation functions, the scatter suppresses the
one-halo term relative to the no-scatter case (dashed lines),
again bringing it more in line with the data and somewhat
compensating for the larger one-halo term required by the
$\mathcal{L}(M)$-only (no-scatter) model.

In other words, models with a large scatter $\sigma_L(M)$ do
not necessarily imply a smaller one-halo term than
no-scatter models, because the shape of the CF is determined
by the interplay of the mean and variance of the L-M
scaling law. Equivalently, there is a degeneracy between
the two in determining the shape of the one-halo term
of the galaxy CF. This is not so surprising because what
sets the shape of the observables is the range of halo
masses producing a luminosity [L, L + dL], rather than
what the median luminosity $\mathcal{L}$ is and how much scatter
$\sigma_L$ is allowed at each mass. That is, successful
models can be found by either "allowing" large scatter
Fig. 10. - Correlation function predictions for four duty cycles (10%-100%) for galaxies at z ~ 5, shown in the same format as
the previous figure.
Fig. 11. - Goodness of fit to the observed measures at
z ~ 4 and 5 as a function of the fixed duty cycle value. Illustrated
are the ranges of the reduced chi-square values of our models
estimated from the observed correlation function measures of the
full sample. All the models that yield a good fit to the observed LF
are considered for each of the fixed duty cycles ranging from 5% to 100%.
For the $B_{435}$-band dropouts, where a better S/N measurement is
available, extremely long duty cycles (> 70%) and very short duty
cycles (< 10%) are ruled out at the 90% confidence level. A similar
trend is seen for the $V_{606}$-band dropouts but less definitively due
to the noisier measures.

to a fraction of low-mass halos that are otherwise meant 



to host a "too-faint-to-be-detected" galaxy, or by adding 
little scatter to the halos that are already bright enough 
to be detected, and everything in between the two. 

A physical concept of interest is the regularity of the
star formation intensity. In other words, one can recast
the two scaling laws to understand how bursty star
formation can be with respect to the mean value $\mathcal{L}(M)$
in a non-negligible fraction of halos. As a representative
value, we use the "burstiness" parameter B defined
earlier. If the 1σ scatter is equal to or
larger than the mean luminosity $\mathcal{L}(M)$ (i.e., B ≥ 1),
then ≈ 16% of all the halos of mass M will host galaxies
brighter than $\mathcal{L}(M)+\sigma_L(M)$, that is, at least twice as
luminous as the mean value. If B is much smaller than unity,
most halos have luminosities close to their mean value
$\mathcal{L}(M)$ with little variance.
The physical meaning of the B parameter pertains to the
major mode of star formation: a low B corresponds to
steady star formation with few outlying "bursts",
while a high B (B > 1) would imply that the star
formation in halos of similar masses can occur at varying
intensities, the range of which is comparable to or larger
than the expected mean. Hence, the B-parameter is a
statistical measure of the mean star formation histories
of the observed galaxies as a function of halo mass. The
low-B halos, by definition, are quiescent while the high-B
halos can include bursty galaxies, and thus span a
wider range of UV luminosities.
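
The ≈ 16% figure is the one-sided Gaussian tail beyond one standard deviation. A short check is given below, assuming (as the text implies) that B is the ratio of the 1σ luminosity scatter to the mean luminosity and that the scatter is Gaussian in luminosity; the function name is illustrative.

```python
import math

def fraction_brighter_than(k, burstiness):
    """Fraction of halos with luminosity >= k times the mean, for Gaussian
    scatter of width sigma = burstiness * mean (assumed definition of B)."""
    if burstiness <= 0:
        return 1.0 if k <= 1.0 else 0.0
    z = (k - 1.0) / burstiness          # threshold in units of sigma above the mean
    return 0.5 * math.erfc(z / math.sqrt(2.0))

frac_twice_mean = fraction_brighter_than(2.0, 1.0)   # B = 1: threshold sits at +1 sigma
frac_quiescent = fraction_brighter_than(2.0, 0.2)    # B << 1: threshold sits at +5 sigma
```

For B = 1 the fraction is about 0.159, while for B = 0.2 essentially no halo reaches twice its mean luminosity, which is the quantitative sense in which low-B halos are steady.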
Fig. 12. - Effect of the L-M scatter on the shape of the LF, the CFs, and the halo occupation distribution. Upper left: the
mean UV luminosity $\mathcal{L}(M)$ (solid line) and the ±1σ luminosity ranges (dotted lines) for halo mass M are shown for an example model.
The dash-dotted line indicates the scaling law for the luminosity scatter $\sigma_L(M)$ in magnitude units. Three vertical lines (on top) mark the
luminosity thresholds corresponding to the three subsamples for the observed CF measures. Upper middle: the model HODs with (solid)
and without (dashed) scatter for the three luminosity thresholds. The HOD plot is rotated to illustrate that in the absence of scatter, the
three mass thresholds (dashed) correspond to the masses solely determined by the mean scaling law $\mathcal{L}(M)$ (solid black line in the top left
panel) such that $L^{i}_{\rm thresh} = \mathcal{L}(M^{i}_{h,\rm thresh})$, where i = 1, 2, 3. Lower left: the panel illustrates how the shape of the LF is transformed
by the introduction of scatter (solid), in comparison to the absence of scatter (dashed), to be in better agreement with the data (filled circles).
The number density on the faint end is largely enhanced by scatter. Right: the model CFs for the three luminosity thresholds are shown
from the lowest median luminosity (top right) to the highest (bottom right). In each of the three panels, we show the total CF with (solid) and
without (dashed) scatter. Both one-halo and two-halo terms are also shown in dotted and dash-dotted lines for the scatter and no-scatter cases,
respectively. The integral constraint IC is shown for each luminosity threshold for the given L-M laws (top right corner).



In what follows, we interpret the L-M scaling laws
allowed by the observations (both LF and CFs) in this
light. Because we do not restrict ourselves to certain
modes of star formation a priori, acceptable models come
in a few different classes of solution. These include
1) models in which the star formation is progressively
burstier toward low-mass halos and subsides at high
mass, i.e., B(M) monotonically declining with mass; 2)
models that are increasingly bursty toward higher-mass
halos, i.e., B(M) monotonically increasing with mass; and
3) models in which the burstiness reaches a minimum at
intermediate masses, i.e., B(M) with a minimum. We note
that the classification of these scenarios is somewhat
arbitrary, made to highlight the overall trend with halo
mass, and thus one scenario is not clearly separated from
the others, as can be seen in Figure 13. In the following
section, we further examine different scenarios in
light of physical considerations, and discuss how
uncertainties can be better constrained by future surveys and



other available data. 

6.3.1. Declining B(M) with Mass

The first class of models corresponds to a case where
galaxies in low-mass halos ($M < 10^{11.\ldots}\,h^{-1}M_\odot$) have
burstier star formation while those in massive halos
($M > 10^{\ldots}\,h^{-1}M_\odot$) have more regular star formation,
close to the median value $\mathcal{L}(M)$. Figure 13 illustrates the
range of the B parameter (upper left) and the 1σ upper
limit on luminosity (lower left), $\mathcal{L}(M)+\sigma_L(M)$, achievable
for halos of mass M and allowed by the observational
constraints at z ~ 4. Both quantities are expressed in
units of magnitude. We also show the inferred HODs
for the three observed luminosity thresholds from these
models when a duty cycle of 50% is assumed (right). At a
50% duty cycle, halos of mass $10^{11}\,h^{-1}M_\odot$ can brighten
by ~2 mag or more above their mean, while halos of mass
$10^{12}\,h^{-1}M_\odot$ can only brighten up to a maximum Δmag ~ 1,
or ≈ 40% of their mean luminosity. Similar to the
duty-cycle-only models, the total luminosity $\mathcal{L}(M)+\sigma_L(M)$
is required to be larger for the low duty-cycle cases in
order to preserve the shape of the observed LF.

This class of models corresponds to a physical scenario 
in which high-mass halos have a steadier accretion of 
gas for star formation than their lower mass counter- 
parts. Hence, the former is well described by a nearly 
constant star formation history, while the latter is
characterized by a shorter e-folding time $\tau_{SF}$. Because SF
episodes take place at random times for different ha- 
los, when averaged over an ensemble of halos, the re- 
sult is the overall increase of star formation rate with 
halo mass together with the decrease of the fractional 
scatter B with mass. This scenario is perhaps in quali- 
tative agreement with the current framework of galaxy 
formation where more massive systems have higher in- 
fall rates (at these redshifts, of both dark matter and 
baryons) than less massive ones. Alternatively, the star
formation may be temporarily quenched in low-mass halos,
as they are more susceptible to supernova feedback
(e.g., Stinson et al. 2007; Scannapieco et al. 2008), until
the critical surface density is reached again to start
another episode, or a temporary enhancement in the star
formation rate may occur due to the fragmentation of their
primordial disks (e.g., Bournaud et al. 2007).

6.3.2. Increasing B(M) with Mass

The second scenario consists of models for which the
B parameter is negligible at low masses. Figure 13
illustrates the range of physical parameters for all the
models in this category. Note that while the B-parameter is
mildly increasing with mass, the value is quite low even
at the highest masses ($B_{\max} \sim 0$ mag, or $\sigma_{L,\max} \sim \mathcal{L}$),
and the logarithmic slope is extremely shallow. The
maximum slope for the B-parameter allowed by the
observations is ≈ 0.28. This is a consequence of the
observed luminosity-dependent clustering and LF. More
specifically, any model that increases more steeply
than this would contradict the observed
luminosity-dependent clustering, not to mention that it would
produce excessively high number densities at the bright end
of the LF. Because in this scenario most halos are not
allowed to have a large scatter, the shape of the L-M
scaling law (lower left) is such that both the low-mass
and high-mass slopes are steeper than in the other two
scenarios (Figure 13).

A plausible physical process likely to result in such a
scenario is an extra contribution from merger-induced
star formation combined with a more regular channel
of star formation via gas accretion. In the ΛCDM
cosmology, merger rates increase mildly with halo mass at a
given epoch (e.g., Neistein & Dekel 2008; Fakhouri & Ma
2008; Stewart et al. 2008), which could cause the
merger-induced star formation also to increase very shallowly
with mass. The main difference of this scenario from
the previous one is that negligible B, or $\sigma_L$ scatter,
at low masses is required in this case. A low
B-parameter implies that the contribution to star formation
from smooth gas accretion has to be rather regular
even at very low masses. In other words, cold gas, which
subsequently gets converted to stars, has to be
continuously trickling in at all times, and thus most galaxies
should have roughly constant star formation histories.
It is not clear whether such regularity is possible
in hydrodynamic simulations, not to mention the
extremely shallow logarithmic slope of the L-M relation
($B(M) \propto M^{0.28}$ or shallower) inferred from our data.
An alternative scenario consistent with the model
includes the quenching of SF and the subsequent bursts
proposed by Birnboim, Dekel, & Neistein (2007), which
preferentially occur in high-mass ($> 10^{12}\,h^{-1}M_\odot$) halos.
However, it is unclear what kind of mass dependence the
proposed process would exhibit.

6.3.3. A Hybrid Model 

The third case (Figure 13) presents the scenario in
which the B-parameter at first decreases steeply with
mass up to ~ $10^{11.\ldots}\,h^{-1}M_\odot$, where it reaches its
minimum, and then increases again toward higher masses.
Again, the logarithmic slope at the high-mass end is
required to be shallow, with a maximum slope of ~ 0.35, to
be consistent with the data. The competition between
the two processes results in a range of halo masses at
which the B-parameter reaches its minimum ($10^{11.\ldots}$ to
$10^{12}\,h^{-1}M_\odot$). However, because the mass $10^{11.5}\,h^{-1}M_\odot$
corresponds to the absolute magnitude $M_{1700} \approx -20.0$, much
brighter than the range we are able to probe with the
observed CFs, this scenario is virtually indistinguishable
from the first with the current data alone. Overall,
the scenario is a hybrid of the previous two cases,
representing the two competing processes dominant at
different mass scales.

At this time, we are unable to discriminate between 
these three models with drastically different physical
implications. This is partly due to the degeneracy between
the effects of the two L-M scaling laws, $\mathcal{L}(M)$ and $\sigma_L(M)$,
on the shape of the galaxy auto-correlation function. As
a result, the SF duty cycle is a more robust constraint 
than the particular "type" of the L-M scatter. In or- 
der to break this degeneracy between physical models, 
we explore the behavior of different models in a higher 
luminosity regime in the next section. 

6.4. Breaking the Degeneracies between Physical Models 

Halo bias increases much more steeply at high masses
than at lower masses. A similar trend was measured
locally by Zehavi et al. (2002), who found that the correlation
length of galaxies increases significantly more steeply for
L > L* galaxies. Hence, large luminosity scatter at high
masses will make the luminosity-dependent bias increase
more mildly than expected for cases with
little scatter. In Figure 14, we show the range of the
average bias values for the $B_{435}$-band dropouts as a function
of $z_{850}$-band magnitude threshold for the three scenarios. As
expected, the bias values for Scenario 1 are higher
than those of the other two at a given luminosity for bright
galaxies with $z_{850}$ < 25.0 (corresponding to ≈ L*). Other surveys
covering a much larger area than the GOODS data should
be able to place a strong constraint on this regime. For
example, according to the Bouwens et al. (2007) estimate of
the $B_{435}$-band dropout surface density, the COSMOS
survey should already have ≈ 1400 galaxies brighter than
$i_{775}$ = 24.5 over the 2 deg² field. On the other hand, in
order to distinguish Scenario 2 from Scenario 3, one needs
to constrain the luminosity dependence on the faint end.
As can be seen from the figure, the effect is much more
subtle because halo bias increases only very mildly at
Fig. 13. - Three physical scenarios for star formation at z ~ 4. Left: the lower left panel shows the 1σ upper limit on luminosity
for a given halo mass M at a duty cycle of 50% for the three scenarios. While the scaling laws for the mean $\mathcal{L}(M)$ and variance $\sigma_L(M)$ differ
vastly among the scenarios, the upper limits show similar behavior (see text for more discussion). The upper left panel shows the
range of B parameters implied by these models in units of magnitude. For example, halos of $10^{11}\,h^{-1}M_\odot$ can be brightened by more than
≈ 2 mag above their mean while halos of mass $10^{12}\,h^{-1}M_\odot$ can only brighten up to ≈ 1 mag, or 40% of their mean luminosity, to be consistent
with the observations. Right: the three right panels present the range of halo occupation distribution allowed for the three luminosity
thresholds used for the data for each of the physical scenarios. The solid colors show the total HOD, while hatched curves show the satellite
contribution. Note that the central term of the HOD for Scenario 2 rises more steeply than the other two, as nearly no scatter is expected
at low masses.



low masses. We also note that the bias values are larger 
for higher duty cycle cases (compare the top and bottom
panels), because a higher duty cycle implies higher
median halo masses for the galaxies in the sample.

Another observational measure we explore is the
bright-faint galaxy cross-correlation function (XCF). The
cross-correlation function delves directly into the L-M
relation and the association of bright "central" and faint
"satellite" galaxies in the same halo. Hence, it should be more
sensitive to the halo occupation distribution and to
the galaxy density profile within the halos. We
compute the galaxy XCF as described in Section 2 for the
models which successfully reproduce the LF and
auto-correlation function constraints. Figure 15 shows the
model predictions of the XCFs for the three physical
scenarios discussed previously (Figure 13) when duty
cycles of 25% (right) and 50% (left) are assumed. The
observed cross-correlation function measure is also shown
as filled squares.

Both duty cycle values are a reasonably good fit to 
the data given the error bars (the median reduced χ²
values are ≈ 0.7 for all three cases). The one-halo term
of the XCFs shows a slight hint of a different slope in 
each scenario, but it is a negligible one. Even if the 
measurement errors were half the current values, the dif- 



ferences between the physical models would be too small 
to be detected observationally. On the other hand, the 
large-scale amplitude, or the two-halo term, makes no 
significant difference at all between different scenarios. 

It should not be surprising, however, that it is not pos- 
sible to discriminate between models with the current 
data. The main reason is that our sample is dominated 
by galaxies much fainter than the characteristic luminos- 
ity. The halo density profile (which we assume galaxies 
follow) is mass-dependent in a way that the inner slope is
shallower for high-mass halos (Navarro et al. 1997), but
in order to see such an effect, one needs to probe the mass 
regimes with a noticeable change in the profile. Hence, 
once we move into a much brighter regime (L > L*), 
one should be able to constrain the different classes of 
physical models we discussed. 

We demonstrate in Figure 16 the expected shape of
the XCFs when the bright sample used for the
cross-correlation includes much brighter galaxies (L > L*)
than the current sample. The same models discussed
previously (Figure 13) are used to compute the XCFs for
different luminosity thresholds where the bright sample
consists of galaxies of luminosity $M_{UV} < M^{*}_{UV} + 0.35$,
$M^{*}_{UV} - 0.15$, and $M^{*}_{UV} - 0.65$, while the same faint
sample is used for all three cases, $M_{UV} > M^{*}_{UV} + 1.50$. It can
Fig. 14. - Model predictions of galaxy bias as a function
of magnitude threshold at z ~ 4. We show the luminosity
dependence of galaxy bias for the three physical scenarios consistent
with the data (see text for discussion) when a star formation
duty cycle of 25% (top) or 50% (bottom) is assumed. Scenario 1
(dark blue) exhibits the strongest luminosity dependence for galaxies
brighter than the characteristic luminosity, $z_{850}$ ~ 25.0, because little
scatter is allowed at high masses. On the other hand, Scenarios 2
and 3 show a milder increase in the galaxy bias as a function of
luminosity threshold because the larger scatter allowed in these models
(see Figure 13) dilutes the strong mass dependence of halo clustering.
We also note that bias values for higher duty cycles (bottom)
should be higher than for lower ones (top). Hence, by measuring the
luminosity dependence of galaxy bias accurately for bright galaxies,
we can discriminate between different physical models for star formation.

be seen from the figure how the one-halo terms for
Scenarios 1 and 2 separate from one another as the luminosity
threshold increases. For other surveys (e.g., COSMOS)
or future surveys, for which a much larger number of
bright galaxies (L > L*) will be available, the
cross-correlation function measurements and a more precise
determination of the luminosity-dependent bias can be effectively
used to constrain the physical model governing
the star formation in these galaxies.

7. DISCUSSIONS 

We presented a simple formalism that allows us to con- 
sider all the available galaxy statistics at high redshift, 
and thereby to extract a set of useful physical information 
governing the star formation processes in these galaxies. 
The formalism provides an empirical tool to understand 
the results of the complex physics of star formation in 
these galaxies from the halo perspective, and thus is com- 
plementary to the ab initio calculations of semi-analytic 
models and hydrodynamic simulations. Our method- 
ology has several advantages over the most commonly 
used methods for constraining the halo occupation
distribution at high redshift. Unlike the HOD formalism, our
method allows for scatter between galaxy luminosity and halo
mass, and thus provides a more realistic representation
of the galaxy-halo association. Not only do we allow
for L-M scatter, but by using several observational
constraints simultaneously, we are also able to constrain the
range of scatter with respect to the mean, an important
clue to the nature of star formation in these galaxies.
Furthermore, the explicitness of the L-M relation in the 
model allows us to connect three of the important galaxy 
statistics commonly measured in surveys, and thereby 
bring these statistics closer together to help provide a 
physical picture of the universe. 

The key questions we try to answer in this work are:
1) the typical duration of star formation in these
galaxies, or their effective occupancy in halos at the given
cosmic time; 2) how the observed UV luminosity correlates
with the masses of their host halos, and how such
a relation evolves with cosmic time; and 3) the main
mode of star formation for these high-redshift galaxies,
namely, whether most galaxies observed in our survey are
"bursting" with star formation and thus atypical of
the halos of similar masses, or whether they mainly
form a "main sequence" of star formation with few
outliers. Here, we summarize our findings and discuss the
physical implications for each of these questions.

7.1. Star Formation Duty Cycle at High Redshift 

The star formation duty cycle, in our formalism, is
measured as the ratio of the number density of
the observed galaxies to that of halos at the same cosmic
epoch. As a first example, if all halos and subhalos hosted
a visible galaxy, the duty cycle would be unity. In that
case, once star formation is initiated, statistically it would
rarely fade below the survey sensitivity, at least within the
cosmic time span our survey probes, and thus the SF e-folding
time for most galaxies should be significantly longer than
the time span of the survey, $t_{\mathrm{SF}} \gg \Delta t_{\mathrm{survey}}$. Our
results rule out such a scenario at 95% confidence, based on
the shape of the two-point correlation function shown in
Figures 9 and 10. As a second example, one can
consider a case where $t_{\mathrm{SF}} \approx \Delta t_{\mathrm{survey}}$. Because the star
formation in each halo must turn on at random times
(independent of the start/finish time of our survey) and
lasts for $\approx \Delta t_{\mathrm{survey}}$, it is easy to show that the mean
duty cycle in this case should be 50%.
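
The 50% figure can be checked with a short Monte Carlo sketch. The setup
below is an assumption about the intended argument, not a calculation taken
from the analysis itself: each halo has a single star formation episode of
fixed length `t_sf`, its start time is drawn uniformly from all values for
which the episode overlaps a survey window of length `dt_survey`, and the
duty cycle is taken to be the mean fraction of the window during which the
halo is "on". The function name and its parameters are hypothetical. Under
these assumptions the expected value is t_sf / (t_sf + dt_survey), which
gives 50% for t_sf = dt_survey and approaches unity for t_sf much longer
than dt_survey.

```python
import random


def mean_visible_fraction(t_sf, dt_survey, n_halos=200_000, seed=0):
    """Mean fraction of the survey window during which a halo is UV-visible.

    Toy model (an assumption, not the paper's calculation): one episode of
    length t_sf per halo, with a start time uniform over every value for
    which the episode overlaps the window [0, dt_survey].
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_halos):
        start = rng.uniform(-t_sf, dt_survey)
        # Length of the overlap between [start, start + t_sf] and the window.
        overlap = min(start + t_sf, dt_survey) - max(start, 0.0)
        total += max(overlap, 0.0) / dt_survey
    return total / n_halos


# Episode as long as the survey window: the mean comes out close to 0.5.
equal_case = mean_visible_fraction(t_sf=1.0, dt_survey=1.0)
# Episode ten times longer than the window: close to 10 / 11.
long_case = mean_visible_fraction(t_sf=10.0, dt_survey=1.0)
```

The same function reproduces both limiting cases discussed above, which is
why a single parametrized model was used rather than two separate ones.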

The best-fit duty cycle values for z ~ 4 are 15-60%
(1σ). Our measurements for the V606-band dropouts also rule
out scenarios with very high duty cycles (DC > 80%),
even though the measurement uncertainties are too large
to place robust constraints at z ~ 5 (see Figure 11). In
units of cosmic time, these correspond to 0.1-0.4 Gyr
for z ~ 4 and < 0.35 Gyr for z ~ 5, when the FWHM Δz
of their respective redshift distributions is used as a
representative time scale for our survey. Hence, we find that
the star formation duty cycle does not seem to evolve
significantly from z ~ 5 to z ~ 4, and is consistently
shorter than a few tenths of a billion years. The
relatively short time scale during which galaxies are visible
in the UV implies that the galaxies observed at z ~ 4 are
unlikely to be the direct descendants of those at z ~ 5,
as the latter are likely to have faded to a lower luminosity in
the UV by z ~ 4, or to have moved on to the
next stage in which they would no longer satisfy the LBG
selection criterion, unless star formation is recurrent.
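
The conversion from a duty cycle to a time scale can be sketched as
follows. This is an illustration under stated assumptions rather than the
calculation behind the numbers above: it assumes a flat ΛCDM cosmology with
placeholder parameters (Ω_m = 0.3, H_0 = 70 km/s/Mpc), uses the standard
closed-form age of a flat matter-plus-Λ universe, and takes the survey time
span to be the cosmic time elapsed across a redshift interval
[z_lo, z_hi]. The function names and the example redshift limits are
hypothetical; the actual FWHM Δz of each dropout sample is not restated in
this section, so the output should not be expected to match the quoted
values.

```python
import math

# 1 km/s/Mpc expressed in 1/Gyr (approximate conversion factor).
KM_S_MPC_TO_INV_GYR = 1.0 / 977.8


def cosmic_age_gyr(z, omega_m=0.3, h0=70.0):
    """Age of a flat matter + Lambda universe at redshift z, in Gyr."""
    omega_l = 1.0 - omega_m
    hubble_time = 1.0 / (h0 * KM_S_MPC_TO_INV_GYR)
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(omega_l)) * hubble_time * math.asinh(x)


def survey_time_span_gyr(z_lo, z_hi, **cosmo):
    """Cosmic time elapsed between z_hi (earlier) and z_lo (later)."""
    return cosmic_age_gyr(z_lo, **cosmo) - cosmic_age_gyr(z_hi, **cosmo)


def duty_cycle_to_time_gyr(duty_cycle, z_lo, z_hi, **cosmo):
    """Visible time per halo implied by a duty cycle over the interval."""
    return duty_cycle * survey_time_span_gyr(z_lo, z_hi, **cosmo)


# Placeholder redshift limits, chosen only to exercise the functions.
example_span = survey_time_span_gyr(3.5, 4.5)
example_time = duty_cycle_to_time_gyr(0.5, 3.5, 4.5)
```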

7.2. The L-M Relations and Evolution of the UV LF 

It is interesting to note that the star formation duty
cycle is the most robust quantity that we are able to
constrain based on the current data. The reason for
this is the degeneracy between the mean UV luminosity
$\mathcal{L}(M)$ and the luminosity variance $\sigma_L(M)$ in the shape
of the two-point correlation function, as discussed
extensively in Section 6.3. While the introduction of the L-M
scatter generally suppresses the one-halo term relative to the
same base model without scatter, the models with
significant scatter also prefer a mean scaling law $\mathcal{L}(M)$
with a larger one-halo term than those with little scatter
(see Figure 12). As a result, the shape of the CFs with
Fig. 15. Galaxy cross-correlation function for galaxies in the GOODS fields: Model predictions for the galaxy cross-correlation
functions are shown together with the data for the three physical models considered (see text). The left panel shows the duty cycle 50%
case. The one- and two-halo terms are shown as line-filled regions as well as the total CF (solid color). The difference in the shape of the
one-halo term among the three models is too small to be measured observationally even if better precision were achieved. As for
the two-halo term, or the large-scale amplitude, there is virtually no difference in all cases. The right panel shows similar predictions made
for the duty cycle 25% case. Note that all the models shown were chosen based on the goodness-of-fit to the LF and auto-correlation functions,
and not based on that for the cross-correlation function. Nevertheless, the models are reasonably good fits to the data.

and without scatter changes little. Simply put, the LF
constraint requires that different L-M scaling laws are
preferred for the models with scatter and those without
it. Hence, the shape of the CF cannot unambiguously
determine what type of "scatter model" is favored, while
the duty cycle and the 1σ upper limit on the UV luminosity
achievable for halos of mass M, $\mathcal{L}(M) + \sigma_L(M)$, can be
determined robustly.

In this work, this inherent degeneracy was further
exacerbated by the uncertain determination of the true
large-scale amplitude of the CFs. The relatively small
area (300 arcmin$^2$) and the weak clustering strength
of the galaxies sampled in our survey result in a correction
(the integral constraint) that is an appreciable fraction of
the true clustering strength (see, e.g., Somerville et al.
2004). Hence, the large-scale measures of the CFs tend
to agree with our model predictions over a wide range
of duty cycle values (15-60%; see Figures 9 and 10). Future
work based on larger surveys (e.g., COSMOS, NOAO
Deep Wide-Field Survey) will likely make a more robust
determination of the duty cycles (for very bright LBGs)
as well as test the validity of our formalism, namely, the
equal treatment of halos and subhalos of the same mass.
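
For readers unfamiliar with the integral constraint, the sketch below
illustrates the commonly used random-pair-weighted estimate of the
correction. It is an assumption that this standard estimator matches the
procedure used for the measurements discussed here; the function names, the
power-law form of the model, and all numerical inputs are hypothetical
placeholders. In a finite field the observed angular correlation function
is biased low by a constant, estimated as the average of the model
correlation function weighted by the random-random pair counts RR(θ).

```python
def integral_constraint(rr_counts, w_model):
    """RR-weighted mean of the model w(theta) over the survey field."""
    if len(rr_counts) != len(w_model):
        raise ValueError("rr_counts and w_model must have the same length")
    total_pairs = sum(rr_counts)
    if total_pairs <= 0:
        raise ValueError("rr_counts must contain at least one pair")
    return sum(rr * w for rr, w in zip(rr_counts, w_model)) / total_pairs


def power_law_w(theta, amplitude, beta):
    """Power-law angular correlation function, A * theta**(-beta)."""
    return [amplitude * t ** (-beta) for t in theta]


# Placeholder angular bins (arcsec) and random-random pair counts.
theta = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0]
rr = [5, 40, 400, 3000, 15000, 20000]

w_true = power_law_w(theta, amplitude=1.0, beta=0.6)
ic = integral_constraint(rr, w_true)
# What a finite field would measure for this model: the true w minus IC.
w_observed = [w - ic for w in w_true]
```

Because most random pairs lie at large separations, the correction is set
mainly by the large-scale amplitude, which is why it matters most when the
clustering is weak and the field is small.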

We find that the UV luminosity and halo mass scale
roughly linearly as $L_{\mathrm{UV}} \propto M^{0.9-1.2}$ for the majority of
galaxies ($L_{\mathrm{UV}} \lesssim L^*$) regardless of the specific choice of
the duty cycle value (Figures O -[7]). The approximately 
constant faint-end slope $\alpha \approx -1.7$ of the LF observed from
redshift 3 out to 6 is a direct result of this linear scaling
law, suggesting that the same star formation physics is
at work throughout these epochs. On the other hand,
the amplitude of the scaling law seems to change mildly
with redshift in such a way that the UV luminosity for a
fixed halo mass M was higher at earlier times by a few
tenths of a magnitude (Figure 8). Our results are in
accord with a similar finding that when galaxy samples
at z ~ 3, 4, and 5 were defined with the same absolute
luminosity threshold, the V606-band dropout sample
had an average halo bias consistent with a lower median
halo mass than its lower-redshift counterparts (Lee et al.
2006). However, a more robust determination of the
galaxy duty cycle is needed to quantify how much
brightening or dimming occurs at different redshifts. Such a
trend may be due to either the buildup of dust with
cosmic time (increasing dust obscuration) or, if the amount
of dust changes little with redshift, a higher efficiency of
star formation at earlier times (Lee et al. 2006).
However, Reddy et al. (2008) found from samples of similarly
selected star-forming galaxies at z ~ 2 and 3 that the
amount of dust obscuration does not change significantly
at those redshifts within the dynamic range of the UV
colors allowed by the selection criteria.
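
The link between a power-law L-M scaling and the faint-end slope of the LF
can be made explicit with a short derivation. It is a sketch under
simplifying assumptions that are ours, not the full model: no L-M scatter,
a mass-independent duty cycle DC, and a pure power-law halo mass function
at low masses. The symbols $\gamma$ (low-mass slope of the halo mass
function) and $a$ (logarithmic slope of the L-M relation) are labels
introduced here for illustration.

```latex
% Assumptions: no L-M scatter, constant duty cycle DC, power-law mass function.
\begin{align}
  \frac{dn}{dM} &\propto M^{-\gamma}, \qquad
  L_{\mathrm{UV}} \propto M^{a}
  \;\Rightarrow\; M \propto L_{\mathrm{UV}}^{1/a}, \\
  \phi(L_{\mathrm{UV}}) &= \mathrm{DC}\,\frac{dn}{dM}\,\frac{dM}{dL_{\mathrm{UV}}}
  \propto L_{\mathrm{UV}}^{-\gamma/a}\, L_{\mathrm{UV}}^{1/a - 1}
  = L_{\mathrm{UV}}^{(1-\gamma)/a - 1}, \\
  \alpha &= \frac{1-\gamma}{a} - 1 .
\end{align}
```

For an exactly linear relation ($a = 1$) this reduces to $\alpha = -\gamma$,
so under these assumptions a faint-end slope that stays constant with
redshift requires both $\gamma$ and $a$ to stay constant, or to vary in a
compensating way.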

The observed evolution of the UV LF can be understood
in the context of the evolution of the L-M relation
and the halo mass function with redshift. We interpolate
the characteristic luminosity L* at each redshift bin to
define a characteristic mass M*. In this interpretation,
the brightening of the characteristic luminosity L* with
time translates into an increase in the characteristic halo
mass with cosmic time (see Figure 8). The increase
of the characteristic mass and the decrease of $L_{\mathrm{UV}}$
for a fixed mass with cosmic time take place in a way
that yields little change in the normalization parameter
$\phi^*$, or the number density of halos at the
characteristic mass. Hence, the normalization parameter $\phi^*$ is
the result of two competing effects: the evolution of the
L-M relation ($L_{\mathrm{UV}}$ dims with time for a fixed mass) and
the evolution of the halo mass function (the ever-increasing
number density of halos with time for any fixed mass).

7.3. The Nature of Star Formation at High Redshift 

We investigated the nature of star formation in
high-redshift galaxies, namely, whether they are dominated
by a small fraction of halos bursting with star formation,
or rather most galaxies are lit up by a continuous
supply of gas accreting into the halo potential wells. We
explored a wide range of scaling laws for the mean $\mathcal{L}(M)$
as well as the L-M scatter $\sigma_L(M)$, and defined the burst
parameter $B(M)$ to be the ratio of the latter to the
former. The B parameter is an indicator of the range of the
achievable UV luminosity with respect to the mean for a
Fig. 16. Galaxy cross-correlation function for very bright galaxies: We demonstrate that when galaxies that are brighter
than $\approx L^*$ are used as the bright sample for the cross-correlation function, different physical scenarios clearly show different behavior at
small angular separations. Shown in two panels are the model predictions of the one- and two-halo terms for the DC = 50% and 25% cases at z ~ 4.
The bottom left corner indicates the luminosity threshold used to define the "bright" samples. While the median luminosity of the bright
sample increases from top to bottom, the same faint sample is used in all three cases ($26.0 < z_{850} < 27.5$). We do not show the total CF
for clarity. Note that the characteristic luminosity reported by Bouwens et al. (2007) is $M^*_{1600} = -21.06$. As the median luminosity of the
bright sample increases (while the faint sample includes the rest), the amplitude of the one-halo term differs for the two scenarios. In turn,
the precise measurement of the XCF in this regime may help constrain the physical model responsible for the main mode of star formation
at high redshift.



substantial fraction (the upper 16%) of galaxies hosted in
halos of mass M. We classified the models that satisfy all
the observational constraints into three categories, each 
painting a very different physical picture. 
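
To make the "upper 16%" statement concrete, the sketch below assumes that,
at fixed halo mass, the UV luminosity is normally distributed about the
mean with standard deviation σ_L. That distributional form is an assumption
made here for illustration and is not restated in this section; the
function names and numerical values are hypothetical. Under this
assumption, the fraction of galaxies brighter than the mean plus σ_L, that
is, brighter than the mean times (1 + B), is the one-sided 1σ tail of a
Gaussian, whatever the value of B.

```python
from statistics import NormalDist


def burst_parameter(mean_lum, sigma_lum):
    """Ratio of the L-M scatter to the mean luminosity at fixed halo mass."""
    if mean_lum <= 0:
        raise ValueError("mean_lum must be positive")
    return sigma_lum / mean_lum


def fraction_above_one_sigma(mean_lum, sigma_lum):
    """Fraction of galaxies brighter than mean + sigma (Gaussian assumed)."""
    dist = NormalDist(mu=mean_lum, sigma=sigma_lum)
    return 1.0 - dist.cdf(mean_lum + sigma_lum)


# Placeholder luminosities in arbitrary units.
b_low_scatter = burst_parameter(mean_lum=1.0, sigma_lum=0.1)
b_high_scatter = burst_parameter(mean_lum=1.0, sigma_lum=0.8)
tail = fraction_above_one_sigma(mean_lum=1.0, sigma_lum=0.8)
```

The tail fraction is the same for both placeholder cases; what changes with
B is how far above the mean that upper tail of galaxies reaches.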

In the first scenario, the B parameter declines
monotonically with halo mass (Figure 13). The physical
interpretation is that high-mass halos have steady accretion of
gas that constantly replenishes the material for star
formation, corresponding to relatively constant star
formation histories (characterized by a long e-folding time, $t_{\mathrm{SF}}$),
hence a very low B value. For low-mass halos, however,
the gas accretion is not as steady as for high-mass ones, and
thus the star formation history of a halo is described by
a shorter time scale $t_{\mathrm{SF}}$ on average. When averaged over
an ensemble of halos of similar masses, each of which
undergoes a SF episode at a different time, the median star
formation rate $\mathcal{L}$ is lower than for high-mass halos while the
variance $\sigma_L$ is high, hence a high B parameter. This
scenario is in qualitative agreement with the current
framework of galaxy formation, in which more massive halos have higher
infall rates than less massive ones. This is also in
qualitative agreement with a high-resolution hydrodynamic
simulation (Nagamine et al. 2007). An alternative scenario
can be considered, in which star formation is recurrently
quenched in low-mass halos by supernova feedback,
resulting in episodic star formation.

The second scenario depicts an entirely different
physical process where the B parameter mildly increases with
mass M (see Figure 13). Our data place a strong
constraint on the logarithmic slope of this increase, such that
the slope has to be very shallow to avoid contradiction 
with the observed luminosity-dependent clustering. The 
maximum slope allowed from the data is B{M) oc M"-^*. 
One plausible physical interpretation of this behavior is
merger-induced star formation (e.g., Kolatt et al. 1999;
Somerville et al. 2001; di Matteo et al. 2007) combined
with a very steady inflow of gas at all masses. The fact
that the minor/major merger rate is higher at higher
masses may explain the increase of the UV luminosity.
However, the B parameter represents the astrophysical
aspect of merger events, so it is unclear if such a
shallow slope is in agreement with analogous predictions
from semi-analytical models or hydrodynamic
simulations. Another problem with this scenario is that the
negligible amount of scatter in low-mass halos requires an
extremely steady flow of cold gas even for very low-mass
halos (M < 10i°/i"iMo). This may not be consistent 
with cosmological DM simulations. It will be interesting
to estimate the infall rate of dark matter into a range of
halo masses, and convert the DM infall rate to that of
gas by using the baryonic matter density $\Omega_b$. This will
allow us to make a rough estimate of the cosmologically
consistent B parameter for low masses as well as high
masses (Guo & White 2008; Conroy & Wechsler 2008).
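
A minimal sketch of the conversion suggested above, assuming that gas
follows dark matter in the cosmic proportion, so that the gas infall rate
is the DM infall rate multiplied by Ω_b / (Ω_m - Ω_b). The density
parameters below are placeholder values chosen for illustration, and the
function name is hypothetical; neither is taken from this work.

```python
def gas_infall_rate(dm_infall_rate, omega_b=0.045, omega_m=0.30):
    """Gas infall rate implied by a dark matter infall rate.

    Assumes baryons are accreted in the cosmic baryon-to-dark-matter
    ratio; the default density parameters are placeholders.
    """
    omega_dm = omega_m - omega_b
    if omega_dm <= 0:
        raise ValueError("omega_m must exceed omega_b")
    return dm_infall_rate * omega_b / omega_dm


# Placeholder DM infall rate in solar masses per year.
example_gas_rate = gas_infall_rate(100.0)
```

Comparing the scatter of such a rate across halos of similar mass with its
mean would give the rough B parameter estimate described in the text.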

The last scenario is a hybrid between the first two such
that the B parameter reaches a minimum at an inter- 
mediate mass range (« 10^^-^ h~^MQ: Figure [T^ . In 
much the same way as the first scenario, the gas accre- 
tion is stochastic at low masses (resulting in large val- 
ues of the B parameter), while at high masses, the con- 
tribution from the merger-induced SF goes up similar 
to the second scenario. Again, the logarithmic slope at 
high masses needs to be very shallow — B{M) (x M'^-^^ or 
shallower — to be consistent with the data. In any case, 
our data suggest that merger-induced star formation
cannot be the primary mechanism to produce UV-bright
star-forming galaxies. Our conclusion is in agreement
with Conroy et al. (2008), who argued based on halo
merger trees that the number density of SF galaxies at
z ~ 2 is far too high compared with that of major/minor
merger events at the same epoch for mergers to have
produced these galaxies.

7.4. Future Directions 

The formalism we have presented offers a powerful
framework for determining the connection between
galaxies and dark matter halos at high redshift, and
potentially for providing insight into the nature of
high-redshift star formation. With current data we were
able to constrain the typical luminosities of high-redshift
galaxies at fixed halo mass fairly well; however, we were
unable to put tight constraints on which physical
scenarios dominate the scatter in UV light between galaxies
at fixed mass. The limitation mainly comes from large
uncertainties in the determination of the large-scale
clustering strength (or the average halo bias), and the small
area of the data sample, which covers a total of 300
arcmin$^2$. While the current data provide an excellent
representation of the relatively faint galaxies that are most
common in the high-redshift universe, they provide only
a handful of bright ($L_{\mathrm{UV}} \gtrsim L^*$) galaxies, where
differences between different physical models begin to emerge
from the shape of the galaxy correlation functions and
the strong luminosity-dependent bias. We conclude by
demonstrating for future surveys the type of
observational measurements to be made in order to discriminate
between these physical scenarios, namely, the bright-faint galaxy
cross-correlation function (Figure 16) and
luminosity-dependent halo bias (Figure 14).

8. CONCLUSIONS 

We have used the observed UV LF and correlation 
function measures for star-forming galaxies at z ~ 4, 5, 
and 6 to infer the nature of star formation and its depen- 
dence on halo mass, in particular for the sub-L* galaxies. 
The main conclusions from this work are as follows: 

1. The star formation duty cycle of Lyman-break
galaxies should be less than 0.35 Gyr at both z ~ 4
and 5. The best-fit duty cycle value for z ~ 4 is
15%-60% (1σ), and < 70% for z ~ 5. The relatively short
time scale during which galaxies are visible in the UV
implies that the galaxies observed at z ~ 4 are unlikely
to be the direct descendants of those at z ~ 5 unless the
star formation is recurrent after a long intermission.

2. The observed UV luminosity must scale approximately
linearly with the halo mass in order to reproduce the
faint-end slope of the UV LF, $\alpha \approx -1.7$, observed at z ~ 4-6 for
galaxies less luminous than the characteristic value L*.
In this interpretation, the constant faint-end slope with
redshift is a direct result of two facts: 1) the low-mass slope of the
total halo mass function remains constant with redshift,
and 2) the observed UV luminosity scales with the halo
mass with a power-law slope close to unity (a = 0.9-1.2)
at z ~ 4-6.

3. While the slope of the L-M scaling law does not
change with redshift, the amplitude of the relation
decreases with cosmic time, such that for a fixed halo mass,
z ~ 5 galaxies appear brighter by $\approx 0.3$ mag than z ~ 4
galaxies. If the dust properties do not change
significantly at those redshifts, this implies that the star formation
efficiency per halo mass was higher at earlier times,
consistent with the results of Lee et al. (2006).

4. We interpret the nonevolution of the normalization
parameter $\phi^*$ with redshift observed at z ~ 4-6 as a result
of two competing processes canceling each other: the
number density of halos at a fixed halo mass increases
with time, while the average UV luminosity in halos of a
fixed halo mass decreases with time.

5. The star formation in massive halos {M > 
lO^° */i~^M0) should be relatively quiescent, and thus 
can be described by a slowly varying star formation
history. The degree of burstiness can be a mildly varying
function of halo mass in this regime, and it may be attributed
to merger-induced star formation in massive halos (as
the halo merger rate is also a mildly increasing function of
mass). Data from wide-field surveys are crucially needed
to quantify the contribution from bursty star formation 
in further detail. 

6. The average star formation history in low-mass halos
(M < lO^°'^/i~"'^M0) is not as well constrained from the 
current data mainly due to the uncertainties in the true 
large-scale bias. The main mode of star formation in this
regime is crucial to understanding the formation histories of
the majority of galaxies detected in the rest-UV surveys: 
whether they are forming stars as quiescently as their 
brighter counterparts (Scenario 2), or they represent a 
small fraction of low-mass halos undergoing "bursty" star 
formation (Scenarios 1 and 3). 